Jan. 04, 2013

OS.Walk and Fnmatch in Python


In an earlier post, "OS.walk in Python", I described os.walk and showed some examples of how to use it in scripts.

In this article, I will show how to use the os.walk() function to walk a
directory tree, and the fnmatch module to match file names.

What is os.walk?

os.walk() generates the file names in a directory tree by walking the tree
either top-down (the default) or bottom-up (when called with topdown=False).
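The difference between the two orders is easiest to see on a small tree. The following is a minimal sketch, not part of the original scripts: the helper name visit_order and the temporary directory layout are made up for illustration. Each item that os.walk yields is a 3-tuple, and only its first element (the directory path) is used here.

```python
import os
import tempfile


def visit_order(top, topdown=True):
    """Return directory paths (relative to top) in the order os.walk visits them."""
    return [
        os.path.relpath(dirpath, top)
        for dirpath, _dirnames, _filenames in os.walk(top, topdown=topdown)
    ]


# Build a throwaway tree: top/a/b
with tempfile.TemporaryDirectory() as top:
    os.makedirs(os.path.join(top, "a", "b"))
    print(visit_order(top, topdown=True))   # parents before children
    print(visit_order(top, topdown=False))  # children before parents
```

Bottom-up order is the one to reach for when a directory can only be processed after everything inside it, for example when deleting empty directories.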

For each directory in the tree rooted at directory top (including top itself),
it yields a 3-tuple (dirpath, dirnames, filenames):

dirpath		a string, the path to the directory.

dirnames	a list of the names of the subdirectories in dirpath
		(excluding '.' and '..').

filenames	a list of the names of the non-directory files in dirpath.

Note that the names in the lists contain no path components. 
To get a full path (which begins with top) to a file or directory in dirpath,
do os.path.join(dirpath, name).
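As a small illustration of the 3-tuple and of os.path.join, here is a sketch that collects the full path of every file under a directory. The function name list_full_paths and the sample files are assumptions made for this example only.

```python
import os
import tempfile


def list_full_paths(top):
    """Return the full path of every non-directory file under top, sorted."""
    paths = []
    for dirpath, dirnames, filenames in os.walk(top):
        for name in filenames:
            # The names carry no path components, so join them with dirpath.
            paths.append(os.path.join(dirpath, name))
    return sorted(paths)


# Build a tiny throwaway tree so the example is self-contained.
with tempfile.TemporaryDirectory() as top:
    os.makedirs(os.path.join(top, "music", "rock"))
    for rel in ("notes.txt", os.path.join("music", "rock", "song.mp3")):
        open(os.path.join(top, rel), "w").close()
    for path in list_full_paths(top):
        print(path)
```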

For more information, please see the Python Docs.

What is fnmatch?

The fnmatch module compares file names against glob-style patterns such as those
used by Unix shells.

These are not the same as the more sophisticated regular expression rules. 

It's purely a string matching operation. 

If you find it more convenient to use a different pattern style, for example 
regular expressions, then simply use regex operations to match your filenames.
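To make the regular-expression alternative concrete, here is a minimal sketch using the standard re module. The sample file names and the image pattern are invented for illustration. It also uses fnmatch.translate(), which converts a glob pattern into an equivalent regular expression string.

```python
import fnmatch
import re

names = ["song.mp3", "cover.JPG", "photo.jpeg", "notes.txt"]

# A regular expression can express alternatives that a single glob cannot.
image_re = re.compile(r".*\.(jpe?g|png)$", re.IGNORECASE)
images = [n for n in names if image_re.match(n)]
print(images)

# fnmatch.translate() turns a glob pattern into a regex string.
mp3_re = re.compile(fnmatch.translate("*.mp3"))
mp3s = [n for n in names if mp3_re.match(n)]
print(mp3s)
```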


What does it do?

The fnmatch module is used for wildcard pattern matching.

Simple Matching
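fnmatch() compares a single file name against a pattern and returns a boolean
indicating whether or not they match.

Both the name and the pattern are first normalized with os.path.normcase(), so
the comparison is case-insensitive on Windows and case-sensitive on POSIX
systems such as Linux. Use fnmatchcase() when the comparison must always be
case-sensitive.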
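Because fnmatch.fnmatch() can behave differently from one platform to another, it helps to know about fnmatch.fnmatchcase(), which compares case-sensitively everywhere. The sketch below also shows one way to force a case-insensitive match by lowercasing both sides. The helper name matches_ignoring_case is hypothetical.

```python
import fnmatch


def matches_ignoring_case(name, pattern):
    """Match name against a glob pattern, ignoring case on every platform."""
    return fnmatch.fnmatchcase(name.lower(), pattern.lower())


print(fnmatch.fnmatch("song.mp3", "*.mp3"))        # True
print(fnmatch.fnmatchcase("SONG.MP3", "*.mp3"))    # False
print(matches_ignoring_case("SONG.MP3", "*.mp3"))  # True
```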

To test a sequence of filenames, you can use fnmatch.filter() (not to be
confused with the built-in filter() function).

It returns a list of the names that match the pattern argument.
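A short sketch of filtering a list of names, with file names invented for illustration. Besides *, the patterns may use ? for any single character and [seq] for any one character in seq.

```python
import fnmatch

names = ["a.mp3", "b.txt", "c.mp3", "notes.md"]

# Keep only the names that match the pattern; the input order is preserved.
mp3_names = fnmatch.filter(names, "*.mp3")
print(mp3_names)  # ['a.mp3', 'c.mp3']

# [ab] matches exactly one character, either 'a' or 'b'.
a_or_b = fnmatch.filter(names, "[ab].*")
print(a_or_b)  # ['a.mp3', 'b.txt']
```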

Find all mp3 files

This script searches for *.mp3 files, starting from rootPath ("/"):

import fnmatch
import os

rootPath = '/'
pattern = '*.mp3'

# Walk the whole tree and print the full path of every matching file.
for root, dirs, files in os.walk(rootPath):
    for filename in fnmatch.filter(files, pattern):
        print(os.path.join(root, filename))

Search computer for specific files

This script uses os.walk and fnmatch.filter to search the hard drive for all
image files:

import fnmatch
import os

images = ['*.jpg', '*.jpeg', '*.png', '*.tif', '*.tiff']
matches = []

# The backslash must be escaped: "C:\" on its own is a syntax error.
for root, dirnames, filenames in os.walk("C:\\"):
    for extension in images:
        for filename in fnmatch.filter(filenames, extension):
            matches.append(os.path.join(root, filename))

There are many other (and faster) ways to do this, but now you understand the
basics of it.
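One such alternative, sketched here under assumptions rather than benchmarked, checks each file name once against a tuple of extensions with str.endswith() instead of calling fnmatch.filter() once per pattern. The names find_images and IMAGE_EXTENSIONS are made up for this example. Unlike the script above, it lowercases names first, so it also matches files such as PHOTO.JPG on every platform.

```python
import os
import tempfile

IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png", ".tif", ".tiff")


def find_images(top):
    """Return the full path of every file under top with an image extension."""
    matches = []
    for root, _dirnames, filenames in os.walk(top):
        for filename in filenames:
            # str.endswith() accepts a tuple and tests all suffixes in one call.
            if filename.lower().endswith(IMAGE_EXTENSIONS):
                matches.append(os.path.join(root, filename))
    return matches


# Demonstrate on a throwaway directory rather than a whole drive.
with tempfile.TemporaryDirectory() as top:
    for name in ("photo.JPG", "scan.tiff", "notes.txt"):
        open(os.path.join(top, name), "w").close()
    print(sorted(find_images(top)))
```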

