Jul. 22, 2013

How to use Reddit API in Python

Reddit API - Overview

In an earlier post, "How to access various Web Services in Python", we described
how to access services such as YouTube, Vimeo and Twitter via their APIs.

Note that there are a few Reddit wrappers that you can use to interact with Reddit.

A wrapper is an API client that wraps the API in easy-to-use functions and makes
the API calls for you.

As a result, the user of the wrapper can be less concerned with how the code
actually works.

If you don't use a wrapper, you have to access Reddit's API directly,
which is exactly what we will do in this post.

Getting started

Since we are going to focus on Reddit's API, let's head over to their API
documentation. I recommend that you get familiar with the documentation and
pay extra attention to the overview and the sections about "modhashes",
"fullnames" and "type prefixes".
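
To make "fullnames" and "type prefixes" a bit more concrete: a fullname is a type prefix and an ID joined by an underscore. The sketch below splits one apart. The prefix table follows the list in the Reddit API documentation; the function name split_fullname and the example ID are invented for this illustration.

```python
# Type prefixes as listed in the Reddit API documentation.
TYPE_PREFIXES = {
    "t1": "comment",
    "t2": "account",
    "t3": "link",
    "t4": "message",
    "t5": "subreddit",
    "t6": "award",
}


def split_fullname(fullname):
    """Split a fullname such as 't1_abc123' into (kind, id).

    split_fullname is a hypothetical helper, not part of any library.
    """
    prefix, _, thing_id = fullname.partition("_")
    if prefix not in TYPE_PREFIXES or not thing_id:
        raise ValueError("not a fullname: %r" % (fullname,))
    return TYPE_PREFIXES[prefix], thing_id


print(split_fullname("t1_abc123"))  # 'abc123' is a made-up ID
```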

The API returns its results as either XML or JSON. In this post we will
use the JSON format.

Please refer to the post mentioned above or the official documentation for more
information about the JSON structure.

API documentation

In the API documentation, you can see there are tons of things to do.

In this post, we have chosen to extract information from our own Reddit account. 

The information we need for that is:
GET /user/username/where[ .json | .xml ]

    - /user/username/overview
    - /user/username/submitted
    - /user/username/comments
    - /user/username/liked
    - /user/username/disliked
    - /user/username/hidden
    - /user/username/saved
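
The URL pattern above can be filled in with a small helper. This is only a sketch: the names LISTINGS and build_listing_url are made up for this post, and the base URL is the one used in the curl example further down.

```python
# Listing names taken from the API documentation list above.
LISTINGS = ("overview", "submitted", "comments", "liked",
            "disliked", "hidden", "saved")


def build_listing_url(username, where, base="http://www.reddit.com"):
    """Return the JSON URL for one of a user's listings.

    build_listing_url is a hypothetical helper, not part of any library.
    """
    if where not in LISTINGS:
        raise ValueError("unknown listing: %r" % (where,))
    return "%s/user/%s/%s/.json" % (base, username, where)


print(build_listing_url("spilcm", "comments"))
# http://www.reddit.com/user/spilcm/comments/.json
```

Rejecting unknown listing names up front means a typo fails in our own code rather than as an HTTP error from Reddit.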

Viewing the JSON output

If we, for example, want to use "comments", the URL would be:
http://www.reddit.com/user/spilcm/comments/.json

You can see that we have replaced "username" and "where" with our own input.
To see the response data, you can either make a curl request, like this:
curl http://www.reddit.com/user/spilcm/comments/.json
...or just paste the URL into your browser.
You can see that the response is JSON. This may be difficult to read in the
browser, unless you have the JSONView extension installed.

JSONView is available for Firefox and Chrome.

Start coding

Now that we have the URL, let's start to do some coding.

Open up IDLE or your favourite editor and import the modules that we will need.
The pprint and json modules are optional.
from pprint import pprint

import requests

import json

Make The API Call

Now it's time to make the API call to Reddit.
r = requests.get(r'http://www.reddit.com/user/spilcm/comments/.json')
Now, we have a Response object called "r". We can get all the information we need
from this object.
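
To give an idea of what the Response object carries, here is a minimal sketch that collects a few of its attributes (status_code, headers and text are attributes of a Requests response). The helper name summarize_response is made up, and FakeResponse is only a stand-in with made-up values so the example runs without network access.

```python
def summarize_response(response):
    """Collect a few commonly used attributes of a response object."""
    return {
        "status_code": response.status_code,
        "content_type": response.headers.get("content-type"),
        "body_length": len(response.text),
    }


class FakeResponse(object):
    """Stand-in using the same attribute names as a Requests response."""
    status_code = 200
    headers = {"content-type": "application/json; charset=UTF-8"}
    text = '{"kind": "Listing", "data": {"children": []}}'


summary = summarize_response(FakeResponse())
print(summary["status_code"])   # 200
print(summary["content_type"])  # application/json; charset=UTF-8
```

With a real response, you would pass the "r" object from above to summarize_response instead of the stand-in.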

JSON Response Content

The Requests module comes with a built-in JSON decoder, which we can use to work
with the JSON data.
The raw output that we get back is not really what we want to display.

The question is, how do we extract useful data from it?

If we just want to look at the top-level keys of the decoded response:
r = requests.get(r'http://www.reddit.com/user/spilcm/comments/.json')

data = r.json()

print data.keys()
That should give us the following output:

[u'kind', u'data']
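
To see how these two keys fit together, here is a sketch that works on a trimmed, made-up response instead of a live one. The overall shape (a "Listing" whose "data" holds a list of "children") follows what Reddit returns; the IDs and comment text are invented.

```python
import json

# A trimmed, made-up response in the general shape of a Reddit listing.
SAMPLE = """
{"kind": "Listing",
 "data": {"children": [
            {"kind": "t1",
             "data": {"id": "abc123", "author": "spilcm", "body": "Hello"}}],
          "after": null,
          "before": null}}
"""

listing = json.loads(SAMPLE)

print(sorted(listing.keys()))          # the two top-level keys
print(listing["kind"])                 # what kind of thing this is
print(sorted(listing["data"].keys()))  # what the "data" part contains
```

So "kind" tells us what we received, and everything we actually want to read sits under "data".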

These keys are very important to us.
Now it's time to get the data that we are interested in.

Get the JSON feed and copy/paste the output into a JSON editor to get a better
overview of the data.

An easy way of doing that is to paste the JSON result into an online JSON editor.

I use http://jsoneditoronline.org/ but any JSON editor should do the job.
Let's see an example of this:
r = requests.get(r'http://www.reddit.com/user/spilcm/comments/.json')
The output can be seen in the image below.
As you can see from the image, we get the same keys (kind, data) as we did before
when we printed the keys.
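
If you would rather stay in the terminal, the optional json and pprint modules can give a similar overview. The sketch below uses a small made-up dictionary in place of the real response.

```python
import json
from pprint import pprint

# Made-up stand-in for the dictionary returned by r.json().
data = {
    "kind": "Listing",
    "data": {
        "children": [
            {"kind": "t1",
             "data": {"id": "abc123", "author": "spilcm", "body": "Hello"}},
        ],
    },
}

# json.dumps with indent gives an indented, editor-like view of the data.
pretty = json.dumps(data, indent=2, sort_keys=True)
print(pretty)

# pprint shows the same structure using Python's own literal syntax.
pprint(data)
```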

Convert JSON into a dictionary

Let's convert the JSON data into a Python dictionary.

You can do that like this:
data = r.json()

Now that we have a Python dictionary, we can start using it to get the results
we want.
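
The same conversion can be tried without a network call using the standard library's json module, which does roughly what the decoder in Requests does for us. The JSON text below is made up; the point is how JSON types map to Python types.

```python
import json

# A made-up JSON document, as it would arrive in the response body.
text = '{"kind": "Listing", "data": {"children": [], "after": null}}'

data = json.loads(text)

print(type(data).__name__)                      # dict
print(type(data["data"]["children"]).__name__)  # list
print(data["data"]["after"])                    # None
```

JSON objects become dictionaries, arrays become lists and null becomes None, which is why ordinary Python indexing works on the result.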

Navigate to find useful data

Just navigate your way down until you find what you're after.
r = requests.get(r'http://www.reddit.com/user/spilcm/comments/.json')


data = r.json()

print data['data']['children'][0]
The result is stored in the variable "data".

To access our JSON data, we simply use bracket notation, as in
data['data']['children'][0] above.

Remember that a list is indexed from zero, so [0] is the first entry.
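
Here is a small sketch of that navigation on made-up data. The helper child_at is hypothetical; it only adds a range check so that asking for an entry that does not exist returns None instead of raising an IndexError.

```python
# Made-up stand-in for the dictionary returned by r.json().
data = {
    "kind": "Listing",
    "data": {
        "children": [
            {"kind": "t1", "data": {"id": "abc123", "body": "First"}},
            {"kind": "t1", "data": {"id": "def456", "body": "Second"}},
        ],
    },
}


def child_at(listing, index):
    """Return the inner data of the entry at index, or None if out of range."""
    children = listing["data"]["children"]
    if not 0 <= index < len(children):
        return None
    return children[index]["data"]


print(child_at(data, 0)["body"])  # First
print(child_at(data, 1)["body"])  # Second
print(child_at(data, 2))          # None
```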

Instead of printing each and every entry, we can use a for loop to iterate
through the list of children.
for child in data['data']['children']:
    print child['data']['id'], "\n", child['data']['author'], child['data']['body']

We can access anything we want like this, just look up what data you are
interested in.
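
If you want to keep the values rather than print them, the loop can be wrapped in a function. This is a sketch with a made-up name (extract_fields) and made-up sample data. It uses .get() because not every entry has every field; as far as I know, a submitted link has no "body", for example.

```python
def extract_fields(listing, fields=("id", "author", "body")):
    """Return one tuple per entry; missing fields become None."""
    rows = []
    for child in listing["data"]["children"]:
        entry = child["data"]
        rows.append(tuple(entry.get(name) for name in fields))
    return rows


# Made-up sample: one comment (t1) and one link (t3) without a "body".
sample = {
    "data": {
        "children": [
            {"kind": "t1",
             "data": {"id": "abc123", "author": "spilcm", "body": "Hello"}},
            {"kind": "t3",
             "data": {"id": "xyz789", "author": "spilcm", "title": "A link"}},
        ],
    },
}

for row in extract_fields(sample):
    print(row)
```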

The complete script

As you can see in our complete script, we only have to import one module: 

import requests

r = requests.get(r'http://www.reddit.com/user/spilcm/comments/.json')


data = r.json()

for child in data['data']['children']:
    print child['data']['id'], "\n", child['data']['author'], child['data']['body']
When you run the script, it prints the ID, author and body of each comment.
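
For anything beyond a quick experiment, a slightly more defensive variant may be worth considering. The sketch below is not the script from this post: fetch_listing and comment_lines are made-up names, the User-Agent string is invented (Reddit's API rules ask clients to identify themselves with a descriptive one), and the 10 second timeout is an arbitrary choice. The http_get parameter exists only so the function can be tried with a stand-in instead of a real request.

```python
def fetch_listing(url, http_get=None, user_agent="reddit-api-tutorial/0.1"):
    """Fetch a Reddit listing and return it as a dictionary.

    http_get defaults to requests.get; pass a stand-in for offline tests.
    """
    if http_get is None:
        import requests  # imported lazily so the sketch loads without it
        http_get = requests.get
    response = http_get(url, headers={"User-Agent": user_agent}, timeout=10)
    response.raise_for_status()  # fail loudly on 4xx/5xx responses
    return response.json()


def comment_lines(listing):
    """Yield one 'id author body' string per comment in the listing."""
    for child in listing["data"]["children"]:
        entry = child["data"]
        yield "%s %s %s" % (entry["id"], entry["author"], entry["body"])


# Usage against the live API (performs a real HTTP request):
# url = "http://www.reddit.com/user/spilcm/comments/.json"
# for line in comment_lines(fetch_listing(url)):
#     print(line)
```

Calling raise_for_status() before decoding means an error response stops the script with a clear exception instead of a confusing KeyError later on.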