PythonForBeginners.com

BeautifulSoup Intro

Author: PFB Staff Writer
Last Updated: August 28, 2020

What is BeautifulSoup?


BeautifulSoup is a Python library for parsing HTML and XML documents. It is published at www.crummy.com.

What can it do?


On their website they write: "Beautiful Soup parses anything you give it, and does the tree traversal stuff for you."

You can tell it to:

"Find all the links"

"Find all the links of class externalLink"

"Find all the links whose urls match 'foo.com'"

"Find the table heading that's got bold text, then give me that text."

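To make those four requests concrete, here is a minimal sketch of how each one might look in code. It assumes Python 3 with the beautifulsoup4 package installed (imported as bs4), which is a newer release than the one used in the examples in this article. The sample HTML and all variable names are made up for illustration.

```python
import re

from bs4 import BeautifulSoup  # assumption: the beautifulsoup4 package is installed

# Made-up sample page, used only for illustration.
SAMPLE_HTML = """
<html><body>
<table>
  <tr><th>Name</th><th><b>Price</b></th></tr>
  <tr><td>Widget</td><td>10</td></tr>
</table>
<a href="/about">About</a>
<a class="externalLink" href="http://foo.com/page">Foo</a>
<a class="externalLink" href="http://example.org/">Example</a>
</body></html>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# "Find all the links"
all_links = soup.find_all("a")

# "Find all the links of class externalLink"
external_links = soup.find_all("a", class_="externalLink")

# "Find all the links whose urls match 'foo.com'"
foo_links = soup.find_all("a", href=re.compile(r"foo\.com"))

# "Find the table heading that's got bold text, then give me that text."
bold_heading = next(th for th in soup.find_all("th") if th.find("b") is not None)
heading_text = bold_heading.get_text()

print(len(all_links), len(external_links), len(foo_links), heading_text)
```

Each request maps to a single find_all call with a different filter, which is the "tree traversal stuff" the quote refers to.
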
BeautifulSoup Example


In this example, we will find every link (a tag) in a web page.

Before we start, we have to import two modules: BeautifulSoup and urllib2. Note that both names date from Python 2. urllib2 exists only in Python 2, and this import style belongs to BeautifulSoup 3.

urllib2 is used to open the URL we want.

We will use the soup.findAll method to search through the soup object for text and HTML tags within the page.

from BeautifulSoup import BeautifulSoup
import urllib2

# Download the page and read its HTML
url = urllib2.urlopen("http://www.python.org")
content = url.read()

# Parse the HTML and collect every <a> tag
soup = BeautifulSoup(content)
links = soup.findAll("a")
print(links)

Output

The links list now holds every element on the python.org page that has an "a" tag, and printing it shows them all.

(The "a" tag defines a hyperlink, which is used to link from one page to another.)
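
The example above relies on urllib2 and the old BeautifulSoup import, which are not available in Python 3. As a rough sketch of the same idea under Python 3, assuming the beautifulsoup4 package is installed: urllib.request.urlopen takes the place of urllib2.urlopen, and find_all is the newer spelling of findAll. The helper names fetch_html and find_links are hypothetical, and the demo uses a small inline page so it runs without network access.

```python
from urllib.request import urlopen  # Python 3 counterpart of urllib2.urlopen

from bs4 import BeautifulSoup  # assumption: the beautifulsoup4 package is installed


def fetch_html(url):
    """Download a page and return its raw bytes (hypothetical helper)."""
    with urlopen(url) as response:
        return response.read()


def find_links(html):
    """Return every <a> tag found in the given HTML (hypothetical helper)."""
    soup = BeautifulSoup(html, "html.parser")
    return soup.find_all("a")


# Small made-up page so the sketch runs offline.
sample = '<p><a href="/about">About</a> and <a href="/docs">Docs</a></p>'

for link in find_links(sample):
    print(link)
```

To run it against a live site, you would pass the downloaded page instead of the sample, for example find_links(fetch_html("http://www.python.org")).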

BeautifulSoup Example 2


To make it a bit more useful, we can filter the links and return only the URLs we want. Here we keep the links whose URL contains the word "python".

from BeautifulSoup import BeautifulSoup
import urllib2
import re

url = urllib2.urlopen("http://www.python.org")
content = url.read()
soup = BeautifulSoup(content)

# href=True skips <a> tags that have no href attribute
for a in soup.findAll('a', href=True):
    # Keep only the links whose URL contains "python"
    if re.search('python', a['href']):
        print("Found the URL: " + a['href'])

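
The same filter can be sketched for Python 3, again assuming the beautifulsoup4 package is installed. The function name find_matching_urls and the sample HTML are made up for illustration. The logic mirrors the example above: keep only the a tags that have an href, then test each URL against a pattern.

```python
import re

from bs4 import BeautifulSoup  # assumption: the beautifulsoup4 package is installed


def find_matching_urls(html, pattern):
    """Return the href of every link whose URL matches pattern (hypothetical helper)."""
    soup = BeautifulSoup(html, "html.parser")
    return [
        a["href"]
        for a in soup.find_all("a", href=True)  # href=True skips tags without an href
        if re.search(pattern, a["href"])
    ]


# Made-up sample page so the sketch runs offline.
sample = """
<a href="https://www.python.org/downloads/">Downloads</a>
<a href="https://example.com/">Elsewhere</a>
<a name="top">No href here</a>
<a href="https://docs.python.org/3/">Docs</a>
"""

for url in find_matching_urls(sample, "python"):
    print("Found the URL:", url)
```

Returning a list instead of printing inside the loop makes the result easier to reuse, for example to save the URLs to a file.
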
Further Reading

I recommend that you head over to http://www.crummy.com to read more about what you can do with this awesome module.

Filed Under: Beautiful Soup, crawler Author: PFB Staff Writer

Copyright © 2012–2025 · PythonForBeginners.com
