• Skip to primary navigation
  • Skip to main content
  • Skip to primary sidebar
PythonForBeginners.com

PythonForBeginners.com

Learn By Example

  • Home
  • Learn Python
    • Python Tutorial
  • Categories
    • Basics
    • Lists
    • Dictionary
    • Code Snippets
    • Comments
    • Modules
    • API
    • Beautiful Soup
    • Cheatsheet
    • Games
    • Loops
  • Python Courses
    • Python 3 For Beginners
You are here: Home / Feedparser / Using Feedparser in Python

Using Feedparser in Python

Author: PFB Staff Writer
Last Updated: August 28, 2020

Overview

In this post we will take a look on how we can download and parse syndicated
feeds with Python.

The Python module we will use for that is “Feedparser”.

The complete documentation can be found here.

What is RSS?

RSS stands for Rich Site Summary and uses standard web feed formats to publish
frequently updated information: blog entries, news headlines, audio, video.

An RSS document (called “feed”, “web feed”, or “channel”) includes full or
summarized text, and metadata, like publishing date and author’s name. [source]

What is Feedparser?

Feedparser is a Python library that parses feeds in all known formats, including
Atom, RSS, and RDF. It runs on Python 2.4 all the way up to 3.3. [source]

RSS Elements

Before we install the feedparser module and start to code, let’s take a look
at some of the available RSS elements.

The most commonly used elements in RSS feeds are “title”, “link”, “description”,
“publication date”, and “entry ID”.

The less commonnly used elements are “image”, “categories”, “enclosures”
and “cloud”.

Install Feedparser

To install feedparser on your computer, open your terminal and install it using
“pip” (A tool for installing and managing Python packages)

sudo pip install feedparser

To verify that feedparser is installed, you can run a “pip list”.

You can of course also enter the interactive mode, and import the feedparser
module there.

If you see an output like below, you can be sure it’s installed.

>>> import feedparser
>>>

Now that we have installed the feedparser module, we can go ahead and begin
to work with it.

Getting the RSS feed

You can use any RSS feed that you want. Since I like to read Reddit, I will use
that for my example.

Reddit is made up of many sub-reddits, the one I am particular interested in for
now is the “Python” sub-reddit.

The way to get the RSS feed, is just to look up the URL to that sub-reddit and
add a “.rss” to it.

The RSS feed that we need for the python sub-reddit would be:
http://www.reddit.com/r/python/.rss

Using Feedparser

You start your program with importing the feedparser module.

import feedparser

Create the feed. Put in the RSS feed that you want.

d = feedparser.parse('http://www.reddit.com/r/python/.rss')

The channel elements are available in d.feed (Remember the “RSS Elements” above)

The items are available in d.entries, which is a list.

You access items in the list in the same order in which they appear in the
original feed, so the first item is available in d.entries[0].

Print the title of the feed

print d['feed']['title']

>>> Python

Resolves relative links

print d['feed']['link']

>>> http://www.reddit.com/r/Python/

Parse escaped HTML

print d.feed.subtitle

>>> news about the dynamic, interpreted, interactive, object-oriented, extensible
programming language Python

See number of entries

print len(d['entries'])

>>> 25

Each entry in the feed is a dictionary. Use [0] to print the first entry.

print d['entries'][0]['title'] 

>>> Functional Python made easy with a new library: Funcy

Print the first entry and its link

print d.entries[0]['link'] 

>>> http://www.reddit.com/r/Python/comments/1oej74/functional_python_made_easy_with_a_new_
library/

Use a for loop to print all posts and their links.

for post in d.entries:
    print post.title + ": " + post.link + "
"

>>>
Functional Python made easy with a new library: Funcy: http://www.reddit.com/r/Python/
comments/1oej74/functional_python_made_easy_with_a_new_
library/

Python Packages Open Sourced: http://www.reddit.com/r/Python/comments/1od7nn/
python_packages_open_sourced/

PyEDA 0.15.0 Released: http://www.reddit.com/r/Python/comments/1oet5m/
pyeda_0150_released/

PyMongo 2.6.3 Released: http://www.reddit.com/r/Python/comments/1ocryg/
pymongo_263_released/
.....
.......
........

Reports the feed type and version

print d.version      

>>> rss20

Full access to all HTTP headers

print d.headers          	

>>> 
{'content-length': '5393', 'content-encoding': 'gzip', 'vary': 'accept-encoding', 'server':
"'; DROP TABLE servertypes; --", 'connection': 'close', 'date': 'Mon, 14 Oct 2013 09:13:34
GMT', 'content-type': 'text/xml; charset=UTF-8'}

Just get the content-type from the header

print d.headers.get('content-type')

>>> text/xml; charset=UTF-8

Using the feedparser is an easy and fun way to parse RSS feeds.

Sources

http://www.slideshare.net/LindseySmith1/feedparser
http://code.google.com/p/feedparser/

Related

Recommended Python Training

Course: Python 3 For Beginners

Over 15 hours of video content with guided instruction for beginners. Learn how to create real world applications and master the basics.

Enroll Now

Filed Under: Feedparser, Modules, Python On The Web, scrapers Author: PFB Staff Writer

More Python Topics

API Argv Basics Beautiful Soup Cheatsheet Code Code Snippets Command Line Comments Concatenation crawler Data Structures Data Types deque Development Dictionary Dictionary Data Structure In Python Error Handling Exceptions Filehandling Files Functions Games GUI Json Lists Loops Mechanzie Modules Modules In Python Mysql OS pip Pyspark Python Python On The Web Python Strings Queue Requests Scraping Scripts Split Strings System & OS urllib2

Primary Sidebar

Menu

  • Basics
  • Cheatsheet
  • Code Snippets
  • Development
  • Dictionary
  • Error Handling
  • Lists
  • Loops
  • Modules
  • Scripts
  • Strings
  • System & OS
  • Web

Get Our Free Guide To Learning Python

Most Popular Content

  • Reading and Writing Files in Python
  • Python Dictionary – How To Create Dictionaries In Python
  • How to use Split in Python
  • Python String Concatenation and Formatting
  • List Comprehension in Python
  • How to Use sys.argv in Python?
  • How to use comments in Python
  • Try and Except in Python

Recent Posts

  • Count Rows With Null Values in PySpark
  • PySpark OrderBy One or Multiple Columns
  • Select Rows with Null values in PySpark
  • PySpark Count Distinct Values in One or Multiple Columns
  • PySpark Filter Rows in a DataFrame by Condition

Copyright © 2012–2025 · PythonForBeginners.com

  • Home
  • Contact Us
  • Privacy Policy
  • Write For Us