I'm not consistent about self-publishing. But sometimes stuff strikes me and I'm pretty sure there's something cool to share. For awhile now I've thought that sharing Python snippets would be an easy thing to throw onto a blog and regularly.
To kick that off here's an bit on API feeds -- namely retrieving them. Anymore I use Python and for about a year now Python 3.X. Most API pulls are usually part of a data analysis or presentation project and as such nobody wants to waste more time pre-coding rigid and overly complicated data structures and for certain not going through a lot of effort just to receive that data. Working over the data is what you want to get on with.
So, for awhile now I've used something like the following Python class for retrieving XML or JSON. The nice thing is that urllib and xmltodict have made it fairly transparent to process either XML or JSON for about 99% of anything I need to access (barring minor tweaks).
import urllib.request
import urllib.response
from urllib.error import HTTPError
import json
import xmltodict
from os.path import join
class APIRetrieve:
def __init__(self,startswith_url,options_dict,debug=False):
#self.options = "&".join(myList)
self.options = []
self.data = {}
options_list = []
for key in options_dict:
options_list.append("%s=%s" % (key,options_dict[key]))
options = "&".join(options_list)
self.full_url = ("%s&%s" % (startswith_url,options))
if (debug == True):
print ("%s" % self.full_url)
def APIRequestJSON(self):
req = urllib.request.Request(self,full_url)
try:
response = urllib.request.urlopen(req).read().decode("utf-8")
self.data=json.loads(response)
except HTTPError as e:
print(e.read().decode("utf-8"))
return self.data
def APIRequestXML(self):
req = urllib.request.Request(self.full_url)
try:
response = urllib.request.urlopen(req).read().decode("utf-8")
self.data=xmltodict.parse(response, process_namespaces=True)
except HTTPError as e:
print(e.read().decode("utf-8"))
return self.data
And there we go.
When requesting most data you pass some info in the HTTP "header" / URL. The above class assumes a couple of things:
1) you know how the URL should be formed
2) The URL is encoded with '&' as the delimiter
In every case in my recent history the class above works -- but always check the API documentation for the website you're pulling data from. I've written a lot of different socket and request classes in various languages over the years -- keep in mind this is the simplest utility class I could come up with. No matter the specific delimiters and options a URL will be formed as [base url][options]. A URL can be any kind of file-stream protocol request, technically. For example it could be a file on your computer, an HTTP GET from a web server or an FTP request.
Using the above class here are two examples:
Example 1
In this bit of code you can see that we're requesting some JSON from a website. Their API requires a base URL and some options. To save space I cheat a bit and add the "request" onto the end of the base url (e.g., "weather?"). I then create a dict that holds the "&" delimited options: the API key for the data account and some other stuff.
This example pulls some air quality data from Open Weather Map website.
def test_json_weather():
base_url = "http://api.openweathermap.org/data/2.5/weather?"
options = {
'zip' : '98230',
'APPID' : 'df58cfc63456b48a97f21a63f561b572'
}
api_weather = APIRetrieve(base_url,options)
current_weather = api_weather.APIRequestJSON()
print(current_weather)
Run the above code and you'll get some JSON output like:
{'dt': 1459982245, 'wind': {'speed': 5.7, 'deg': 250}, 'clouds': {'all': 75}, 'cod': 200, 'main': {'humidity': 71, 'temp_max': 292.93, 'temp': 289.04, 'temp_min': 286.15, 'pressure': 1027}, 'weather': [{'id': 803, 'icon': '04d', 'main': 'Clouds', 'description': 'broken clouds'}], 'id': 5787475, 'sys': {'sunset': 1459997516, 'type': 1, 'country': 'US', 'sunrise': 1459949691, 'id': 3245, 'message': 0.0083}, 'name': 'Blaine', 'coord': {'lon': -122.75, 'lat': 48.99}, 'base': 'cmc stations'}
Now, honestly, this is where I confess that I dislike XML. Not in some absolute way, just that it's so overblown and overly complicated for most basic needs. XML is overhead. JSON is a Python dictionary, enough said. Nevertheless I've used XML a lot. I've usually worked hard to strip down my XML needs to very well defined and pared down functionality. Managing XML and XSLT is like asking for trouble.
But, here we go. Almost the same exact method -- and in the Python class I actually use xmltodict to transform XML data into JSON/dict format. I'm using Python for it's usability so why wouldn't I? Well, you'll see in this example that the data doesn't come back quite the same. But this is because the website's API treats that data differently. Since we can get JSON I don't care.
Example 2
def test_xml_weather():
#&mode=xml
base_url = "http://api.openweathermap.org/data/2.5/weather?"
options = {
'zip' : '98230',
'APPID' : 'df58cfc63456b48a97f21a63f561b572',
'mode' : 'xml'
}
api_weather = APIRetrieve(base_url,options)
current_weather = api_weather.APIRequestXML()
print(current_weather)
And the output is:
OrderedDict([('current', OrderedDict([('city', OrderedDict([('@id', '5787475'), ('@name', 'Blaine'), ('coord', OrderedDict([('@lon', '-122.75'), ('@lat', '48.99')])), ('country', 'US'), ('sun', OrderedDict([('@rise', '2016-04-06T13:34:51'), ('@set', '2016-04-07T02:51:56')]))])), ('temperature', OrderedDict([('@value', '289.04'), ('@min', '286.15'), ('@max', '292.93'), ('@unit', 'kelvin')])), ('humidity', OrderedDict([('@value', '71'), ('@unit', '%')])), ('pressure', OrderedDict([('@value', '1027'), ('@unit', 'hPa')])), ('wind', OrderedDict([('speed', OrderedDict([('@value', '5.7'), ('@name', 'Moderate breeze')])), ('gusts', None), ('direction', OrderedDict([('@value', '250'), ('@code', 'WSW'), ('@name', 'West-southwest')]))])), ('clouds', OrderedDict([('@value', '75'), ('@name', 'broken clouds')])), ('visibility', None), ('precipitation', OrderedDict([('@mode', 'no')])), ('weather', OrderedDict([('@number', '803'), ('@value', 'broken clouds'), ('@icon', '04d')])), ('lastupdate', OrderedDict([('@value', '2016-04-06T22:37:25')]))]))])
Now, for those API cases where you have formats that go beyond
[base url][?|format={format}][options|&more_options] it's easy enough to format the URL per your specific API needs. You can request anything with a URL like images, video, whatnots. As long as you can pass that data to something that handles that MIME format you're in web business.
As usual, anyone desperately trying to make something work should do their best to understand what it is they're doing. Results can and will vary!
I've got a bunch of Python to share as I can fit it in. Next up I think it'll either be handling Juniper switches with Paramiko and Python or Windows Netlogon parsing.
Cheers.
#python #api #xml #json