PythonForBeginners.com

How to Extract a Date from a .txt File in Python

Author: Josh Petty
Last Updated: August 31, 2021

In this tutorial, we’ll examine the different ways you can extract a date from a .txt file using Python programming. Python is a versatile language—as you’ll discover—and there are many solutions for this problem.

First, we’ll look at using regular expression patterns to search text files for dates that fit a predefined format. We’ll learn about using the re library and creating our own regular expression searches.

We’ll also examine datetime objects and use them to turn matched strings into structured date values. Lastly, we’ll see how the datefinder module simplifies searching a text file for dates that don’t follow a fixed format, as we might find in natural language content.

Extract a Date from a .txt File using Regular Expression

Dates are written in many different formats. Sometimes people write month/day/year. Other dates might include times of the day, or the day of the week (Wednesday July 8, 2021 8:00PM).

How dates are formatted is a factor to consider before we go about extracting them from text files. 

For instance, if a date follows the month/day/year format, we can find it using a regular expression pattern. With a regular expression, or regex for short, we can search a text for strings that match a predefined pattern.

The beauty of regular expressions is that we can use special characters to create powerful search patterns. For instance, we can craft a pattern that will find all the formatted dates in the following body of text.

minutes.txt
10/14/2021 – Meeting with the client.
07/01/2021 – Discussed marketing strategies.
12/23/2021 – Interviewed a new team lead.
01/28/2018 – Changed domain providers.
06/11/2017 – Discussed moving to a new office.

Example: Finding formatted dates with regex

import re

# open the text file and read the data;
# the with block closes the file for us
with open("minutes.txt", 'r') as file:
    text = file.read()

# match a regex pattern for formatted dates
matches = re.findall(r'(\d+/\d+/\d+)', text)

print(matches)

Output

['10/14/2021', '07/01/2021', '12/23/2021', '01/28/2018', '06/11/2017']

The regex pattern here uses special characters to define the strings we want to extract from the text file. The sequence \d matches any single digit, and + means “one or more of the preceding item,” so \d+ matches a run of one or more digits.
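
Because \d+ accepts a run of digits of any length, the pattern above would also match something like 123/4567/8. A slightly stricter sketch, assuming the dates are always written month/day/year with a four-digit year, uses counted repetition and named groups to pull out each part. The sample text is inlined here instead of being read from minutes.txt so the snippet runs on its own, and the group names are our own choice.

```python
import re

# sample text standing in for the contents of minutes.txt
text = """10/14/2021 - Meeting with the client.
07/01/2021 - Discussed marketing strategies.
Ticket 123/4567/8 is not a date."""

# \b marks a word boundary, {1,2} means one or two digits, {4} exactly four
pattern = re.compile(r'\b(?P<month>\d{1,2})/(?P<day>\d{1,2})/(?P<year>\d{4})\b')

found = []
for m in pattern.finditer(text):
    # convert each named group to an integer
    found.append((int(m.group('month')), int(m.group('day')), int(m.group('year'))))

print(found)
```

The ticket number is skipped because its digit runs don't fit the one-or-two, one-or-two, four shape. Note that this still checks only the shape of the text, not whether the month and day are valid calendar values.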

We can also use regex to find dates that are formatted in different ways. By altering our regex pattern, we can find dates that use either a forward slash (/) or a hyphen (-) as the separator.

This works because regex supports character classes: a set of characters written in square brackets matches any one of them. The class [/-] says that either character, a forward slash or a hyphen, is an acceptable match.

apple2.txt
The first Apple II was sold on 07-10-1977. The last of the Apple II
models were discontinued on 10/15/1994.

Example: Matching dates with a regex pattern

import re

# open a text file
f = open("apple2.txt", 'r')

# extract the file's content
content = f.read()

# a regular expression pattern to match dates
pattern = r"\d{2}[/-]\d{2}[/-]\d{4}"

# find all the strings that match the pattern
dates = re.findall(pattern, content)

for date in dates:
    print(date)

f.close()

Output

07-10-1977
10/15/1994
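
One caveat: each [/-] in the pattern is matched independently, so a mixed string such as 01-02/2003 would also be accepted. If that matters, a backreference can require the second separator to be the same character as the first. This is a minimal sketch with inlined sample text, not part of the original example.

```python
import re

# sample text; the last "date" mixes a hyphen and a slash
content = "Sold on 07-10-1977, discontinued on 10/15/1994, typo 01-02/2003."

# ([/-]) captures the first separator; \1 must match that same character again
pattern = re.compile(r'\b\d{2}([/-])\d{2}\1\d{4}\b')

# finditer + group() returns the whole match rather than just the captured separator
dates = [m.group() for m in pattern.finditer(content)]

print(dates)
```

We use finditer() here because findall() returns only the captured group when a pattern contains one, which would give us a list of separators instead of dates.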

Examining the full extent of regex’s potential is beyond the scope of this tutorial. Try experimenting with some of the following special characters to learn more about using regular expression patterns to extract a date—or other information—from a .txt file.

Special Characters in Regex

  • \s – Any whitespace character (space, tab, newline)
  • \S – Any character except whitespace
  • \d – Any digit from 0 to 9
  • \D – Any character except a digit
  • \w – Any word character: a letter, digit, or underscore [a-zA-Z0-9_]
  • \W – Any non-word character
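
To see a few of these special characters in action, here is a small sketch that applies them to one made-up line of text. The sample string and variable names are ours, chosen only for illustration.

```python
import re

line = "Invoice_42 due 06/11/2017"

digits = re.findall(r'\d+', line)      # runs of digits
words = re.findall(r'\w+', line)       # runs of letters, digits, and underscores
fields = re.split(r'\s+', line)        # split wherever whitespace appears
only_digits = re.sub(r'\D', '', line)  # remove every non-digit character

print(digits)
print(words)
print(fields)
print(only_digits)
```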

Extract a Datetime Object from a .txt File

In Python we can use the datetime library for manipulating dates and working with time. The datetime library is part of Python’s standard library, so there’s no need to install it.

By using datetime objects, we have more control over string data read from text files. For example, we can use a datetime object to get the current date and time from our computer’s clock.

import datetime

now = datetime.datetime.now()
print(now)

Output

2021-07-04 20:15:49.185380
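
As a rough illustration of the extra control a datetime object gives us over a plain string, the sketch below formats, shifts, and compares a date. It uses a fixed, made-up timestamp instead of now() so the results are the same on every run.

```python
from datetime import datetime, timedelta

# a fixed moment, so the results are the same on every run
stamp = datetime(2021, 7, 4, 20, 15)

us_style = stamp.strftime('%m/%d/%Y')      # format as month/day/year
iso_style = stamp.date().isoformat()       # format as year-month-day
next_week = stamp + timedelta(days=7)      # date arithmetic
is_past = stamp < datetime(2021, 12, 25)   # comparisons work directly

print(us_style)
print(iso_style)
print(next_week)
print(is_past)
```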

In the following example, we’ll extract a date from a company .txt file that mentions a scheduled meeting. Our employer needs us to scan a group of such documents for dates. Later, we plan to add the information we gather to a SQLite database.

We’ll begin by defining a regex pattern that will match our date format. Once a match is found, we’ll use it to create a datetime object from the string data.

schedule.txt
The project begins next month. Denise has scheduled a meeting in the conference room at the Embassy Suites on 10-7-2021.

Example: Creating datetime objects from file data

import re
from datetime import datetime

# open the data file and read its contents
with open("schedule.txt", 'r') as file:
    text = file.read()

# look for a day-month-year date such as 10-7-2021
match = re.search(r'\d+-\d+-\d{4}', text)
if match:
    # create a new date object from the regex match
    date = datetime.strptime(match.group(), '%d-%m-%Y').date()
    print(f"The date of the meeting is on {date}.")
else:
    print("Found no dates.")

Output

The date of the meeting is on 2021-07-10.
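
Two things are worth keeping in mind with strptime(): it raises a ValueError when the text doesn't fit the format, and a string like 10-7-2021 could mean either 10 July or 7 October depending on who wrote it. One possible way to handle both, sketched below, is to try a list of candidate formats in a fixed order and return the first one that parses. The helper name parse_date and the format list are assumptions for illustration; order the list to suit your own files.

```python
from datetime import datetime

# candidate formats, tried in order; day-first comes before month-first here
FORMATS = ['%d-%m-%Y', '%m-%d-%Y', '%m/%d/%Y']

def parse_date(text):
    """Return a date for the first format that fits, or None if none do."""
    for fmt in FORMATS:
        try:
            return datetime.strptime(text, fmt).date()
        except ValueError:
            # this format did not fit, so try the next one
            continue
    return None

print(parse_date('10-7-2021'))    # parsed day-first
print(parse_date('12-23-2021'))   # 23 is not a month, so month-first is used
print(parse_date('99-99-2021'))   # no format fits
```

The order of the list decides how ambiguous strings are read, so this only makes sense when you know which convention your documents usually follow.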

Extracting Dates from a Text File with the Datefinder Module

The Python datefinder module can locate dates in a body of text. Using the find_dates() function, it’s possible to search text data for many different types of dates. Datefinder returns each date it finds as a datetime object.

Unlike the re and datetime modules we’ve used so far, datefinder is not part of Python’s standard library. The easiest way to install the datefinder module is to use pip from the command prompt.

pip install datefinder

With datefinder installed, we’re ready to open files and extract data. For this example, we’ll use a text document that introduces a fictitious company project. Using datefinder, we’ll extract each date from the .txt file and print its datetime object counterpart.

Feel free to save the file locally and follow along.

project_timeline.txt
PROJECT PEPPER

All team members must read the project summary by
January 4th, 2021.

The first meeting of PROJECT PEPPER begins on 01/15/2021

at 9:00am. Please find the time to read the following links by then.
created on 08-12-2021 at 05:00 PM

This project file has dates in many formats. Dates are written using dashes and forward slashes. What’s worse, the month January is written out. How can we find all these dates with Python?

Example: Using datefinder to extract dates from file data

import datefinder

# open the project schedule
file = open("project_timeline.txt",'r')

content = file.read()

# datefinder will find the dates for us
matches = list(datefinder.find_dates(content))

if len(matches) > 0:
    for date in matches:
        print(date)
else:
    print("Found no dates.")

file.close()

Output
2021-01-04 00:00:00
2021-01-15 09:00:00
2021-08-12 17:00:00

As you can see from the output, datefinder is able to find a variety of date formats in the text. Not only is the package capable of recognizing the names of months, but it also recognizes the time of day if it’s included in the text.
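
For comparison, if installing a third-party package isn't an option, a small part of this behavior can be approximated with the standard library alone. The sketch below is not how datefinder works internally. It is a hand-rolled alternative that combines two regex patterns with strptime(), and it assumes only two formats: a written-out English month name with an optional ordinal suffix, and month/day/year with slashes. The sample text is an excerpt of project_timeline.txt, inlined so the snippet runs on its own.

```python
import re
from datetime import datetime

content = """All team members must read the project summary by
January 4th, 2021.
The first meeting of PROJECT PEPPER begins on 01/15/2021"""

MONTHS = ('January|February|March|April|May|June|July|'
          'August|September|October|November|December')

# month name, day with an optional ordinal suffix, then a four-digit year
written = re.compile(rf'\b({MONTHS})\s+(\d{{1,2}})(?:st|nd|rd|th)?,\s+(\d{{4}})\b')
# two-digit month and day, four-digit year, separated by slashes
numeric = re.compile(r'\b\d{2}/\d{2}/\d{4}\b')

found = []
for m in written.finditer(content):
    month, day, year = m.groups()
    found.append(datetime.strptime(f'{month} {day} {year}', '%B %d %Y'))
for m in numeric.finditer(content):
    found.append(datetime.strptime(m.group(), '%m/%d/%Y'))

for d in sorted(found):
    print(d)
```

Every new format needs its own pattern and format string here, which is exactly the work datefinder saves us. This version also ignores the time of day.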

In another example, we’ll use the datefinder package to extract a date from a .txt file that includes the dates for a popular singer’s upcoming tour.

tour_dates.txt
Saturday July 25, 2021 at 07:00 PM     Inglewood, CA
Sunday July 26, 2021 at 7 PM     Inglewood, CA
09/30/2021 7:30PM  Foxborough, MA

Example: Extract a tour date and times from a .txt file with datefinder

import datefinder

# open the tour schedule
file = open("tour_dates.txt",'r')

content = file.read()

# datefinder will find the dates for us
matches = list(datefinder.find_dates(content))

if len(matches) > 0:
    print("TOUR DATES AND TIMES")
    print("--------------------")
    for date in matches:
        # use f string to format the text
        print(f"{date.date()}     {date.time()}")
else:
    print("Found no dates.")
file.close()

Output

TOUR DATES AND TIMES
--------------------
2021-07-25     19:00:00
2021-07-26     19:00:00
2021-09-30     19:30:00

As you can see from the examples, datefinder can find many different types of dates and times. This is useful if the dates you’re looking for don’t have a certain format, as will often be the case in natural language data.

Summary

In this post, we’ve covered several ways to extract a date or time from a .txt file. We’ve seen the power of regular expressions for finding matches in string data, and we’ve seen how to convert those matches into Python datetime objects.

Finally, if the dates in your text files don’t have a specified format—as will be the case in most files with natural language content—try the datefinder module. With this Python package, it’s possible to extract dates and times from a text file that aren’t conveniently formatted ahead of time.

Copyright © 2012–2025 · PythonForBeginners.com

  • Home
  • Contact Us
  • Privacy Policy
  • Write For Us