COMP 364: Internet I/O

Monday October 30, 2017

Carlos G. Oliver, Christopher J.F Cameron

Before we start

  • Assignment 3 is OUT!
  • Due: Monday, Nov 13 at 11:59 PM
  • Midterm grades will be out very soon.
  • If you would like to see your exam, please come to office hours

Outline: Internet I/O

  • What is the internet and how does it work?
  • Interacting with the internet using Python

What is the internet and how does it work?

The internet is a network of connected computers

  • Originally used as a way for scientists to collaborate and communicate.
  • The internet follows a standard suite of data transmission protocols called TCP/IP, which lets any computer communicate with any other computer.
  • Tim Berners-Lee is credited with inventing the World Wide Web in 1989 and hosting the first web server.

The internet

  • Some computers on the internet are servers

  • Others are clients (typically us)

  • Servers fulfill requests from clients
  • Example: you connect to the Google servers (via URL mapped to IP address) and request a page with links to pictures of dogs.
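
To make the client and server roles concrete, here is a minimal sketch that runs both on one machine using only Python's standard library. The handler class DogHandler, the /dogs path, and the JSON reply are made up for illustration; a real server such as Google's is far more elaborate.

```python
import json
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer


class DogHandler(BaseHTTPRequestHandler):
    """Hypothetical server side: answers every GET request with a small JSON body."""

    def do_GET(self):
        body = json.dumps({"path": self.path, "message": "success"}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence the default per-request logging


def fetch_from_local_server(path):
    """Start a throwaway server, act as a client once, then shut the server down."""
    server = HTTPServer(("127.0.0.1", 0), DogHandler)  # port 0: the OS picks a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        conn = HTTPConnection("127.0.0.1", server.server_address[1], timeout=5)
        conn.request("GET", path)          # the client sends a request ...
        resp = conn.getresponse()          # ... and the server fulfills it
        answer = json.loads(resp.read().decode("utf-8"))
        conn.close()
        return resp.status, answer
    finally:
        server.shutdown()
        server.server_close()


status, answer = fetch_from_local_server("/dogs")
print(status, answer)
```

The client only knows the server's address (here 127.0.0.1 and a port) and the path it wants; everything else is decided by the server-side code.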

The internet and Python

  • Normally we send requests to the server through our browser by clicking on buttons or typing in URLs.
  • Since we are good at Python, we can automate this process.
  • More specifically, we can use Python to interact with Web APIs.

Web APIs

  • A Web API is a set of functions implemented on the server side that can be called by the client to either GET data from the server or PUT data on the server.
  • Web APIs let us programmatically retrieve data from and place data on the server without having to manually click through browser buttons.
  • This also means that servers can host huge datasets without us having to download everything. We can request just the small piece we are interested in, e.g. only Instagram pictures with the hashtag #YOLO.

Python's requests module

  • Python has a very useful module that lets us manage these requests, called requests.

$ conda install requests

In [35]:
import requests

The International Space Station API

  • Open Notify (http://open-notify.org/) is a website (aka web server) that provides up-to-date information on the status of the International Space Station, based on NASA data.
  • We can access different parts of the API (endpoints) depending on the kind of information we want to retrieve.
  • For today let's try to find out 3 things:
    • Where is the ISS at this moment?
    • How many astronauts are in space right now?
    • When will the ISS fly over our location?
  • The way we talk to the API is through the URL string. We encode our request into the URL string, the server API interprets it, and sends back a response.
  • If we want to receive information from the server, we use the requests.get function, which sends an HTTP GET request to the API.
  • The iss-now.json endpoint tells us where the space station is at this moment.
In [36]:
base_url = "http://api.open-notify.org/"
#get current ISS location using the iss-now.json endpoint
response = requests.get(base_url + "iss-now.json")

The Response object

  • Sending a request to the server with the requests module produces a Response object containing the server's answer.
  • Here are some useful attributes of the Response object
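Attribute     Description
-----------   ------------------------------------------------------------
status_code   Number representing the status of the request (200 means success, 404 means the resource was not found)
content       Raw data received from the server (as bytes)
url           URL string sent to the server
headers       Extra info about the response, such as its content type and when it was generated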
In [37]:
print(response.url)
print(response.status_code)
print(response.content)
print(response.headers)
http://api.open-notify.org/iss-now.json
200
b'{"message": "success", "timestamp": 1509373604, "iss_position": {"longitude": "-51.1057", "latitude": "47.1351"}}'
{'Server': 'nginx/1.10.3', 'Date': 'Mon, 30 Oct 2017 14:26:44 GMT', 'Content-Type': 'application/json', 'Content-Length': '113', 'Connection': 'keep-alive', 'access-control-allow-origin': '*'}
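
Status codes follow a convention: 2xx means success, 4xx means the client asked for something the server could not provide, and 5xx means the server itself failed. As a small sketch, the hypothetical helper describe_status below uses the standard library's http.HTTPStatus to turn a code into a readable message. With the requests module, response.raise_for_status() is a common way to turn 4xx/5xx codes into exceptions instead.

```python
from http import HTTPStatus


def describe_status(code):
    """Return a short human-readable summary of a standard HTTP status code.

    HTTPStatus(code) raises ValueError for codes it does not know.
    """
    phrase = HTTPStatus(code).phrase
    if 200 <= code < 300:
        outcome = "success"
    elif 400 <= code < 500:
        outcome = "client error"
    elif 500 <= code < 600:
        outcome = "server error"
    else:
        outcome = "other"
    return f"{code} {phrase} ({outcome})"


for code in (200, 404, 500):
    print(describe_status(code))
```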
In [18]:
print(type(response.content))
<class 'bytes'>
  • Oops. The content looks like a dictionary but it's actually a byte sequence (similar to a string).

  • This is because data travels between client and server as raw bytes (text), which is not very convenient inside your code. We want to write content['iss_position']['longitude'], so we have to convert it to a dictionary.

  • API responses are often formatted as JSON, which looks much like a Python dictionary, with key:value pairs.

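We can convert the response data to a dict using the Response object's json() method, which parses the JSON text for us (the json module's json.loads does the same job on a string).

In [19]:
import json

# parse the JSON body into a Python dict
r = response.json()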
In [20]:
r
Out[20]:
{'iss_position': {'latitude': '-47.7310', 'longitude': '14.0044'},
 'message': 'success',
 'timestamp': 1509315256}
In [21]:
print(f"The space station is currently at {r['iss_position']}")
The space station is currently at {'longitude': '14.0044', 'latitude': '-47.7310'}
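
The same conversion can be done by hand with the json module, which is useful when the bytes do not come from a Response object. The sketch below reuses the raw bytes printed earlier for the iss-now.json endpoint. Note that the API returns the coordinates as strings, so they are converted to floats before being used as numbers.

```python
import json

# raw bytes exactly as returned by the iss-now.json endpoint in the earlier cell
raw = (b'{"message": "success", "timestamp": 1509373604, '
       b'"iss_position": {"longitude": "-51.1057", "latitude": "47.1351"}}')

data = json.loads(raw.decode("utf-8"))   # bytes -> str -> dict

position = data["iss_position"]
lat = float(position["latitude"])        # coordinates arrive as strings
lon = float(position["longitude"])

print(type(data).__name__)
print(f"latitude={lat}, longitude={lon}")
```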
In [21]:
from IPython.display import Image
Image(url="https://media1.tenor.com/images/5f1c69725f33736abdbcbe52d059ad59/tenor.gif")
Out[21]:

Passing parameters to APIs

  • Often APIs take arguments (just like regular Python functions)
  • These are handled as key:value pairs and passed with the params= keyword argument of requests.get.
  • Let's figure out when the ISS will next pass over Montreal
  • We will use the iss-pass.json endpoint, with Montreal's coordinates as parameters: 45.5017° N, 73.5673° W (west longitudes are written as negative numbers).
In [30]:
montreal = {'lat': 45.5017, 'lon': -73.5673 }
response = requests.get(base_url + 'iss-pass.json', params=montreal)
In [34]:
response.status_code
Out[34]:
200

We could have typed up the URL manually and entered it into the browser as it appears below.

In [37]:
response.url
Out[37]:
'http://api.open-notify.org/iss-pass.json?lat=45.5017&lon=-73.5673'
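
To see how a parameter dictionary ends up in the URL, here is a small sketch that builds the same query string by hand with the standard library's urllib.parse, then takes it apart again. It is only meant to illustrate the encoding; requests does this work for us when we pass params=.

```python
from urllib.parse import parse_qs, urlencode, urlparse

base_url = "http://api.open-notify.org/"
montreal = {"lat": 45.5017, "lon": -73.5673}

# key=value pairs are joined with '&' and appended after a '?'
query = urlencode(montreal)
url = f"{base_url}iss-pass.json?{query}"
print(url)

# going the other way: split a URL back into its endpoint and parameters
parts = urlparse(url)
decoded = parse_qs(parts.query)   # every value comes back as a list of strings
print(parts.path, decoded)
```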
In [38]:
response.content
Out[38]:
b'{\n  "message": "success", \n  "request": {\n    "altitude": 100, \n    "datetime": 1509318078, \n    "latitude": 45.5017, \n    "longitude": -73.5673, \n    "passes": 5\n  }, \n  "response": [\n    {\n      "duration": 499, \n      "risetime": 1509355748\n    }, \n    {\n      "duration": 636, \n      "risetime": 1509361433\n    }, \n    {\n      "duration": 623, \n      "risetime": 1509367238\n    }, \n    {\n      "duration": 615, \n      "risetime": 1509373064\n    }, \n    {\n      "duration": 639, \n      "risetime": 1509378863\n    }\n  ]\n}\n'
In [40]:
r = response.json()
print(r)
{'message': 'success', 'request': {'altitude': 100, 'datetime': 1509318078, 'latitude': 45.5017, 'longitude': -73.5673, 'passes': 5}, 'response': [{'duration': 499, 'risetime': 1509355748}, {'duration': 636, 'risetime': 1509361433}, {'duration': 623, 'risetime': 1509367238}, {'duration': 615, 'risetime': 1509373064}, {'duration': 639, 'risetime': 1509378863}]}
In [46]:
import datetime
for p in r['response']:
    #convert from UNIX time https://en.wikipedia.org/wiki/Unix_time
    date = datetime.datetime.fromtimestamp(int(p['risetime']))
    print(date)
2017-10-30 05:29:08
2017-10-30 07:03:53
2017-10-30 08:40:38
2017-10-30 10:17:44
2017-10-30 11:54:23
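
datetime.fromtimestamp without a time zone uses the local clock of whichever computer runs the code, so the output above depends on where it was run. As a sketch, the version below makes the time zone explicit and also uses the duration field (in seconds) to compute when each pass ends. The fixed UTC-4 offset is an assumption: it matches Montreal's daylight time on this date, but it does not handle daylight saving changes. The helper name describe_pass is made up for illustration.

```python
from datetime import datetime, timedelta, timezone

# first two passes from the response above
passes = [
    {"duration": 499, "risetime": 1509355748},
    {"duration": 636, "risetime": 1509361433},
]

# assumption: Montreal was on Eastern Daylight Time (UTC-4) on 2017-10-30
EDT = timezone(timedelta(hours=-4), "EDT")


def describe_pass(p, tz=timezone.utc):
    """Return (start, end) of a pass as timezone-aware datetimes."""
    rise = datetime.fromtimestamp(p["risetime"], tz=tz)
    end = rise + timedelta(seconds=p["duration"])
    return rise, end


for p in passes:
    rise, end = describe_pass(p, EDT)
    print(f"{rise:%Y-%m-%d %H:%M:%S %Z} for {p['duration']} s (until {end:%H:%M:%S})")
```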

How many astronauts are in space right now?

In [47]:
astronauts = requests.get(base_url + "astros.json")
print(astronauts.json())
{'number': 6, 'message': 'success', 'people': [{'name': 'Sergey Ryazanskiy', 'craft': 'ISS'}, {'name': 'Randy Bresnik', 'craft': 'ISS'}, {'name': 'Paolo Nespoli', 'craft': 'ISS'}, {'name': 'Alexander Misurkin', 'craft': 'ISS'}, {'name': 'Mark Vande Hei', 'craft': 'ISS'}, {'name': 'Joe Acaba', 'craft': 'ISS'}]}
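
Once the response is a dictionary, ordinary Python is enough to answer the question. The sketch below copies the dictionary printed above into a variable and groups the astronauts by spacecraft; in a live session the variable would instead be astronauts.json().

```python
from collections import defaultdict

# copied from the astros.json output above
astros = {
    "number": 6,
    "message": "success",
    "people": [
        {"name": "Sergey Ryazanskiy", "craft": "ISS"},
        {"name": "Randy Bresnik", "craft": "ISS"},
        {"name": "Paolo Nespoli", "craft": "ISS"},
        {"name": "Alexander Misurkin", "craft": "ISS"},
        {"name": "Mark Vande Hei", "craft": "ISS"},
        {"name": "Joe Acaba", "craft": "ISS"},
    ],
}

# group names by the craft they are on
crew_by_craft = defaultdict(list)
for person in astros["people"]:
    crew_by_craft[person["craft"]].append(person["name"])

print(f"{astros['number']} people are in space right now")
for craft, names in crew_by_craft.items():
    print(f"{craft}: {', '.join(names)}")
```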

There are TONS of free APIs

The POST method

  • So far we've only been getting data from the server. But we can also send data for the server to store.

  • This is known as the POST method

  • Let's use the Twitter API to practice posting data.

  • Here are the docs for the API. They will come in handy.

  • Twitter is more choosy about who uses their API, so I had to register and obtain an access key from them. In subsequent API calls I will use the auth token to get permission. (Don't worry about this part.)
  • Large APIs often like to keep track of their users to make sure they are not abusing the server. Making too many requests can crash the server. Most APIs limit the number of requests per hour.
In [10]:
from carlos_auth import get_auth

auth = get_auth()
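
Because most APIs limit how often we may call them, it is polite (and often necessary) to pause between requests. The Throttle class below is a hypothetical helper, not part of requests or of any API: a minimal sketch that enforces a minimum gap between consecutive calls. The clock and sleep arguments exist only so the behaviour can be checked without actually waiting.

```python
import time


class Throttle:
    """Hypothetical helper: enforce a minimum gap (in seconds) between calls."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None   # time of the previous call, None before the first one

    def wait(self):
        """Sleep just long enough to respect min_interval; return seconds slept."""
        delay = 0.0
        if self._last is not None:
            elapsed = self._clock() - self._last
            delay = max(0.0, self.min_interval - elapsed)
            if delay > 0:
                self._sleep(delay)
        self._last = self._clock()
        return delay


throttle = Throttle(min_interval=0.05)
delays = [throttle.wait() for _ in range(3)]   # first call never waits
print(delays)
```

In practice one would call throttle.wait() right before each requests.get or requests.post, with min_interval chosen from the limits stated in the API's documentation.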

Searching on Twitter (GET)

  • The auth= keyword will take my access token
  • The params= keyword will take info about my search query
In [5]:
base_url = 'https://api.twitter.com/'
# search for Pizza, recent tweets, and 2 of them
search_params = {
    'q': 'Pizza',
    'result_type': 'recent',
    'count': 2
}

search_url = f'{base_url}1.1/search/tweets.json'

search_resp = requests.get(search_url, auth=auth, params=search_params)
In [6]:
print(search_resp.url)
print(search_resp.status_code)
https://api.twitter.com/1.1/search/tweets.json?q=Pizza&result_type=recent&count=2
200

The search_resp object contains a lot of info. If we want just the tweet text, we parse the response with search_resp.json() and look up the 'statuses' key, which gives us a list of tweets; for each tweet we access the text field.

In [7]:
tweets = search_resp.json()
for x in tweets['statuses']:
    print(x['text'] + '\n')
Do u think about me as much as I think about u?

RT @lebaenesepapi: Pineapple and non-pineapple pizza eaters must put our differences aside and join forces to defeat this evil https://t.co…

POST a Tweet!

In [47]:
post_params = {'status': "Hi from COMP 364! #IAMAGOD"}
post_url = 'https://api.twitter.com/1.1/statuses/update.json'

post_tweet = requests.post(post_url, auth=auth, params=post_params)
In [48]:
post_tweet.status_code
Out[48]:
200
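
A POST request can carry its parameters either in the URL's query string (as params= does above) or in the request body as form data (which is what the data= argument of requests.post does). The sketch below only builds the two kinds of request with the standard library's urllib.request.Request so we can inspect them; nothing is sent over the network and no authentication is involved.

```python
from urllib.parse import urlencode
from urllib.request import Request

post_url = "https://api.twitter.com/1.1/statuses/update.json"
post_params = {"status": "Hi from COMP 364! #IAMAGOD"}

# spaces, '!' and '#' are not allowed as-is, so they get percent-encoded
encoded = urlencode(post_params)

# Option 1: parameters travel in the query string of the URL
as_query = Request(f"{post_url}?{encoded}", method="POST")

# Option 2: parameters travel in the request body as form data
as_body = Request(post_url, data=encoded.encode("ascii"), method="POST")

print(as_query.get_method(), as_query.full_url)
print(as_body.get_method(), as_body.full_url, as_body.data)
```

Which form an API expects is stated in its documentation; sending data in the body is the more common choice for larger payloads because URLs have practical length limits.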
In [28]:
Image(url="https://media.sticker.market/gif/sleepy-goodnight-good-night-584f425fdb58323238889c67-g.gif")
Out[28]: