• Skip to primary navigation
  • Skip to main content
  • Skip to primary sidebar
PythonForBeginners.com

PythonForBeginners.com

Learn By Example

  • Home
  • Learn Python
    • Python Tutorial
  • Categories
    • Basics
    • Lists
    • Dictionary
    • Code Snippets
    • Comments
    • Modules
    • API
    • Beautiful Soup
    • Cheatsheet
    • Games
    • Loops
  • Python Courses
    • Python 3 For Beginners
You are here: Home / Basics / Pandas Read Excel into DataFrame

Pandas Read Excel into DataFrame

Author: Aditya Raj
Last Updated: April 17, 2023

Excel Files are used in the Windows file system to store tabular data. Unlike a CSV file that contains a single sheet, excel files can store multiple sheets. In this article, we will discuss different ways to read multiple sheets from an Excel file into a pandas dataframe.

Table of Contents
  1. Read Excel File Using The read_excel()Method
  2. Using the  ExcelFile Class and read_excel() Method
  3. Using the  ExcelFile Class and The parse() Method
  4. Pandas Read Multiple Excel Sheets By Name
  5. Pandas Read Multiple Excel Sheets By Position
  6. Pandas Read Multiple Excel Sheets into a Dictionary
  7. Read Multiple Excel Sheets into a Single Pandas DataFrame
  8. Conclusion

Read Excel File Using The read_excel()Method

To read an Excel file using the read_excel() method defined in the pandas module, we will simply pass the file name to the read_excel() method. After execution, the read_excel() method returns the first sheet of the Excel file as a dataframe.

For the examples in this article, we will use the following Excel file.

sample_excel_fileDownload

You can read the above Excel file using the read_excel() method in the pandas module as shown below.

import pandas as pd
filename='sample_excel_file.xlsx'
df = pd.read_excel(filename)
print("The dataframe is:")
print(df)

Output:

The dataframe is:
   Student  Physics  Chemistry  Biology
0     Wade       92         76       73
1     Dave       95         96       71
2     Ivan       65         62       95
3    Riley       68         92       66
4  Gilbert       74         95       76
5    Jorge       99         79       91
6      Dan       72         94       65

The Excel file in the above example contains three sheets. However, when we read an Excel file using the read_csv() method, it returns the first sheet of the Excel file as pandas dataframe.

Using the  ExcelFile Class and read_excel() Method

The ExcelFile class is a great tool for reading multiple sheets from an Excel file in Python. The ExcelFile() constructor takes the filename of the Excel file as its input argument and returns an ExcelFile object. We can use this ExcelFile object to read multiple sheets into pandas dataframes. For instance, you can read the name of all the sheets of an Excel file using the ExcelFile class as shown below.

import pandas as pd
xls = pd.ExcelFile('sample_excel_file.xlsx')
sheet_names=xls.sheet_names
print("The sheet names in the excel file are:")
print(sheet_names)

Output:

The sheet names in the excel file are:
['class_1', 'class_2', 'class_3']

In the above example, we first created an ExcelFile object using the ExcelFile() function. Then, we used the sheet_names attribute to get the list of sheet names in the input Excel file.

Once we get the ExcelFile object, we can use it to read a sheet from the Excel file. For this, we will first get the name of all the sheets using the sheet_names attribute of the ExcelFile object. Then, we will pass the ExcelFile object as the first input argument and the name of the required sheet as the second input argument to the read_excel() method. After execution of the read_excel() method, you will get the sheet into the dataframe as shown below.

import pandas as pd
xls = pd.ExcelFile('sample_excel_file.xlsx')
sheet=xls.sheet_names
print("The sheet names in the excel file are:")
print(sheet)
df = pd.read_excel(xls, 'class_1')
print("The dataframe is:")
print(df)

Output:

The sheet names in the excel file are:
['class_1', 'class_2', 'class_3']
The dataframe is:
   Student  Physics  Chemistry  Biology
0     Wade       92         76       73
1     Dave       95         96       71
2     Ivan       65         62       95
3    Riley       68         92       66
4  Gilbert       74         95       76
5    Jorge       99         79       91
6      Dan       72         94       65

In this example, we passed the name of the Excel File as the first input argument and the name of the sheet as the second input argument to the read_excel() method. After execution, it returns the specified sheet as a dataframe.

Using the  ExcelFile Class and The parse() Method

If you don’t know the names of the sheets in the Excel file, you can use the parse() method to read the sheet into a pandas dataframe. The parse() method, when invoked on the ExcelFile object, takes the position of the sheet in the Excel file. After execution, it returns the data frame as shown below.

import pandas as pd
xls = pd.ExcelFile('sample_excel_file.xlsx')
df = xls.parse(0)
print("The first dataframe is excel file is:")
print(df)

Output:

The first dataframe is excel file is:
   Student  Physics  Chemistry  Biology
0     Wade       92         76       73
1     Dave       95         96       71
2     Ivan       65         62       95
3    Riley       68         92       66
4  Gilbert       74         95       76
5    Jorge       99         79       91
6      Dan       72         94       65

In this example, we have passed the index 0 to the parse() method. Hence, it returned the first sheet of the Excel file in the dataframe. To read other sheets, you can pass their respective position to the parse() method as the input argument.

Pandas Read Multiple Excel Sheets By Name

To read multiple Excel sheets by name into pandas dataframes, we will first create an ExcelFile object using the ExcelFile() function. Then, we will use the sheet_names attribute of the ExcelFile object to get a list of all the sheet names in the Excel file. Finally, we will use the read_excel() method to read the sheets into different dataframes as shown in the following example.

import pandas as pd
xls = pd.ExcelFile('sample_excel_file.xlsx')
sheet=xls.sheet_names
print("The sheet names in the excel file are:")
print(sheet)
df1 = pd.read_excel(xls, 'class_1')
df2 = pd.read_excel(xls, 'class_2')
df3 = pd.read_excel(xls, 'class_3')
print("The first dataframe is:")
print(df1)
print("The second dataframe is:")
print(df2)
print("The third dataframe is:")
print(df3)

Output:

The sheet names in the excel file are:
['class_1', 'class_2', 'class_3']
The first dataframe is:
   Student  Physics  Chemistry  Biology
0     Wade       92         76       73
1     Dave       95         96       71
2     Ivan       65         62       95
3    Riley       68         92       66
4  Gilbert       74         95       76
5    Jorge       99         79       91
6      Dan       72         94       65
The second dataframe is:
     Student  Physics  Chemistry  Math
0      Harry       68         92    69
1        Dan       74         95    96
2     Joshua       99         79    77
3   Jennifer       72         94    61
4       Lisa       98         99    93
5   Patricia       93         67    78
6  Elizabeth       99         76    78
The third dataframe is:
  Student  Physics  Math  Biology
0  Aditya       92    95       73
1   Chris       95    79       71
2     Sam       65    75       95
3   Harry       68    69       66
4    Golu       74    96       76
5    Joel       99    77       91
6     Tom       72    61       65
7   Harsh       98    93       95
8   Clara       93    78       79
9    Tina       99    78       94

Instead of reading the sheets into different variables, we can read multiple sheets of an Excel file into a list of dataframes.

For this, we will first create an empty list to store the output dataframes. Then,  we will iterate through the sheet names using the for loop. While iterating, we will read each sheet using the read_excel() method and add it to the list using the append() method. After execution of the for loop, we will get all the dataframes created from the Excel file in a list.

You can observe this in the following example.

import pandas as pd
xls = pd.ExcelFile('sample_excel_file.xlsx')
sheet=xls.sheet_names
print("The sheet names in the excel file are:")
print(sheet)
df_list=[]
for name in sheet:
    df=pd.read_excel(xls, name)
    df_list.append(df)
print("The list of dataframes is:")
print(df_list)

Output:

The sheet names in the excel file are:
['class_1', 'class_2', 'class_3']
The list of dataframes is:
[   Student  Physics  Chemistry  Biology
0     Wade       92         76       73
1     Dave       95         96       71
2     Ivan       65         62       95
3    Riley       68         92       66
4  Gilbert       74         95       76
5    Jorge       99         79       91
6      Dan       72         94       65,      
Student  Physics  Chemistry  Math
0      Harry       68         92    69
1        Dan       74         95    96
2     Joshua       99         79    77
3   Jennifer       72         94    61
4       Lisa       98         99    93
5   Patricia       93         67    78
6  Elizabeth       99         76    78,   
Student  Physics  Math  Biology
0  Aditya       92    95       73
1   Chris       95    79       71
2     Sam       65    75       95
3   Harry       68    69       66
4    Golu       74    96       76
5    Joel       99    77       91
6     Tom       72    61       65
7   Harsh       98    93       95
8   Clara       93    78       79
9    Tina       99    78       94]

Pandas Read Multiple Excel Sheets By Position

To read multiple Excel sheets into pandas dataframes by position, we will use the parse() method and the ExcelFile() function. For this, we will use the following steps.

  • First, we will create an ExcelFile object using the ExcelFile() function. The ExcelFile() function takes the name of the Excel file as its input argument and returns an ExcelFile object. 
  • Next, we find the number of sheets in the Excel file. For this, we will first get the list of sheet names using the sheet_names attribute of the ExcelFile object. We will also get the length of the list containing sheet names using the len() function.
  • Once we get the number of sheets in the Excel file, we will use the parse() method to read all the sheets into the pandas dataframes. The parse() method, when invoked on an ExcelFile object, takes the position of the sheet as its input argument and returns the data into a dataframe.

You can observe the steps discussed above in the following example.

import pandas as pd
xls = pd.ExcelFile('sample_excel_file.xlsx')
sheet=xls.sheet_names
print("The sheet names in the excel file are:")
print(sheet)
number_of_sheets=len(sheet)
for position in range(number_of_sheets):
    df=xls.parse(position)
    print("The dataframe at position {} is:".format(position))
    print(df)

Output:

The sheet names in the excel file are:
['class_1', 'class_2', 'class_3']
The dataframe at position 0 is:
   Student  Physics  Chemistry  Biology
0     Wade       92         76       73
1     Dave       95         96       71
2     Ivan       65         62       95
3    Riley       68         92       66
4  Gilbert       74         95       76
5    Jorge       99         79       91
6      Dan       72         94       65
The dataframe at position 1 is:
     Student  Physics  Chemistry  Math
0      Harry       68         92    69
1        Dan       74         95    96
2     Joshua       99         79    77
3   Jennifer       72         94    61
4       Lisa       98         99    93
5   Patricia       93         67    78
6  Elizabeth       99         76    78
The dataframe at position 2 is:
  Student  Physics  Math  Biology
0  Aditya       92    95       73
1   Chris       95    79       71
2     Sam       65    75       95
3   Harry       68    69       66
4    Golu       74    96       76
5    Joel       99    77       91
6     Tom       72    61       65
7   Harsh       98    93       95
8   Clara       93    78       79
9    Tina       99    78       94

In the above example, we printed the dataframes after reading them using a for loop and the parse() method. Instead of printing the dataframes, you can also make a list of dataframes of all the sheets as shown below.

import pandas as pd
xls = pd.ExcelFile('sample_excel_file.xlsx')
sheet=xls.sheet_names
print("The sheet names in the excel file are:")
print(sheet)
number_of_sheets=len(sheet)
df_list=[]
for position in range(number_of_sheets):
    df=xls.parse(position)
    df_list.append(df)
print("The list of dataframes is:")
print(df_list)

Output:

The sheet names in the excel file are:
['class_1', 'class_2', 'class_3']
The list of dataframes is:
[   Student  Physics  Chemistry  Biology
0     Wade       92         76       73
1     Dave       95         96       71
2     Ivan       65         62       95
3    Riley       68         92       66
4  Gilbert       74         95       76
5    Jorge       99         79       91
6      Dan       72         94       65,
      Student  Physics  Chemistry  Math
0      Harry       68         92    69
1        Dan       74         95    96
2     Joshua       99         79    77
3   Jennifer       72         94    61
4       Lisa       98         99    93
5   Patricia       93         67    78
6  Elizabeth       99         76    78,
   Student  Physics  Math  Biology
0  Aditya       92    95       73
1   Chris       95    79       71
2     Sam       65    75       95
3   Harry       68    69       66
4    Golu       74    96       76
5    Joel       99    77       91
6     Tom       72    61       65
7   Harsh       98    93       95
8   Clara       93    78       79
9    Tina       99    78       94]

In this example, we created a list to store the output list of dataframes. Inside the for loop, we have read the sheets into dataframes using the parse() method and appended it to the list using the append() method. After execution of the for loop, we get the list of dataframes as shown above.

Pandas Read Multiple Excel Sheets into a Dictionary

Instead of a list, we can read an Excel sheet into a Python dictionary. Here, the keys of the Python dictionary are the sheet names and the values are the dataframes corresponding to each sheet. We will read the Excel file using the read_excel() method. However, we will set the sheet_name parameter to None in the read_excel() method. After execution of the read_excel() method, we will get the dictionary containing the dataframe as shown below.

import pandas as pd
myDict= pd.read_excel('sample_excel_file.xlsx',sheet_name=None)
print("The dictionary of dataframes is:")
print(myDict)

Output:

The dictionary of dataframes is:
{'class_1':    Student  Physics  Chemistry  Biology
0     Wade       92         76       73
1     Dave       95         96       71
2     Ivan       65         62       95
3    Riley       68         92       66
4  Gilbert       74         95       76
5    Jorge       99         79       91
6      Dan       72         94       65, 
'class_2':      Student  Physics  Chemistry  Math
0      Harry       68         92    69
1        Dan       74         95    96
2     Joshua       99         79    77
3   Jennifer       72         94    61
4       Lisa       98         99    93
5   Patricia       93         67    78
6  Elizabeth       99         76    78,
 'class_3':   Student  Physics  Math  Biology
0  Aditya       92    95       73
1   Chris       95    79       71
2     Sam       65    75       95
3   Harry       68    69       66
4    Golu       74    96       76
5    Joel       99    77       91
6     Tom       72    61       65
7   Harsh       98    93       95
8   Clara       93    78       79
9    Tina       99    78       94}

In this example, the read_excel() method returned a dictionary containing dataframes instead of a single dataframe. This is due to the reason that we have set the sheet_name parameter to None.

Read Multiple Excel Sheets into a Single Pandas DataFrame

To read multiple Excel sheets into a single dataframe, we will first read the sheets into a list of dataframes using the approaches discussed in the previous sections. Then, we will use the pandas concat operation to concatenate all the dataframes. You can observe this in the following example.

import pandas as pd
xls = pd.ExcelFile('sample_excel_file.xlsx')
sheet=xls.sheet_names
print("The sheet names in the excel file are:")
print(sheet)
df_list=[]
for name in sheet:
    df=pd.read_excel(xls, name)
    df_list.append(df)
output_df=pd.concat(df_list, ignore_index=True)
print("The output dataframe is:")
print(output_df)

Output:

The sheet names in the excel file are:
['class_1', 'class_2', 'class_3']
The output dataframe is:
      Student  Physics  Chemistry  Biology  Math
0        Wade       92       76.0     73.0   NaN
1        Dave       95       96.0     71.0   NaN
2        Ivan       65       62.0     95.0   NaN
3       Riley       68       92.0     66.0   NaN
4     Gilbert       74       95.0     76.0   NaN
5       Jorge       99       79.0     91.0   NaN
6         Dan       72       94.0     65.0   NaN
7       Harry       68       92.0      NaN  69.0
8         Dan       74       95.0      NaN  96.0
9      Joshua       99       79.0      NaN  77.0
10   Jennifer       72       94.0      NaN  61.0
11       Lisa       98       99.0      NaN  93.0
12   Patricia       93       67.0      NaN  78.0
13  Elizabeth       99       76.0      NaN  78.0
14     Aditya       92        NaN     73.0  95.0
15      Chris       95        NaN     71.0  79.0
16        Sam       65        NaN     95.0  75.0
17      Harry       68        NaN     66.0  69.0
18       Golu       74        NaN     76.0  96.0
19       Joel       99        NaN     91.0  77.0
20        Tom       72        NaN     65.0  61.0
21      Harsh       98        NaN     95.0  93.0
22      Clara       93        NaN     79.0  78.0
23       Tina       99        NaN     94.0  78.0

In the above example, we have used the read_csv() method to read the sheets from the Excel file. Instead, you can use the approach using the parse() method as shown below.

import pandas as pd
xls = pd.ExcelFile('sample_excel_file.xlsx')
sheet=xls.sheet_names
print("The sheet names in the excel file are:")
print(sheet)
number_of_sheets=len(sheet)
df_list=[]
for position in range(number_of_sheets):
    df=xls.parse(position)
    df_list.append(df)
output_df=pd.concat(df_list, ignore_index=True)
print("The output dataframe is:")
print(output_df)

Output:

The sheet names in the excel file are:
['class_1', 'class_2', 'class_3']
The output dataframe is:
      Student  Physics  Chemistry  Biology  Math
0        Wade       92       76.0     73.0   NaN
1        Dave       95       96.0     71.0   NaN
2        Ivan       65       62.0     95.0   NaN
3       Riley       68       92.0     66.0   NaN
4     Gilbert       74       95.0     76.0   NaN
5       Jorge       99       79.0     91.0   NaN
6         Dan       72       94.0     65.0   NaN
7       Harry       68       92.0      NaN  69.0
8         Dan       74       95.0      NaN  96.0
9      Joshua       99       79.0      NaN  77.0
10   Jennifer       72       94.0      NaN  61.0
11       Lisa       98       99.0      NaN  93.0
12   Patricia       93       67.0      NaN  78.0
13  Elizabeth       99       76.0      NaN  78.0
14     Aditya       92        NaN     73.0  95.0
15      Chris       95        NaN     71.0  79.0
16        Sam       65        NaN     95.0  75.0
17      Harry       68        NaN     66.0  69.0
18       Golu       74        NaN     76.0  96.0
19       Joel       99        NaN     91.0  77.0
20        Tom       72        NaN     65.0  61.0
21      Harsh       98        NaN     95.0  93.0
22      Clara       93        NaN     79.0  78.0
23       Tina       99        NaN     94.0  78.0

You can also read the Excel sheets into a dictionary and then concatenate them into a single dataframe as shown below.

import pandas as pd
myDict= pd.read_excel('sample_excel_file.xlsx',sheet_name=None)
print("The output dataframe is:")
df=pd.concat(myDict, ignore_index=True)
print(df)

Output:

The output dataframe is:
      Student  Physics  Chemistry  Biology  Math
0        Wade       92       76.0     73.0   NaN
1        Dave       95       96.0     71.0   NaN
2        Ivan       65       62.0     95.0   NaN
3       Riley       68       92.0     66.0   NaN
4     Gilbert       74       95.0     76.0   NaN
5       Jorge       99       79.0     91.0   NaN
6         Dan       72       94.0     65.0   NaN
7       Harry       68       92.0      NaN  69.0
8         Dan       74       95.0      NaN  96.0
9      Joshua       99       79.0      NaN  77.0
10   Jennifer       72       94.0      NaN  61.0
11       Lisa       98       99.0      NaN  93.0
12   Patricia       93       67.0      NaN  78.0
13  Elizabeth       99       76.0      NaN  78.0
14     Aditya       92        NaN     73.0  95.0
15      Chris       95        NaN     71.0  79.0
16        Sam       65        NaN     95.0  75.0
17      Harry       68        NaN     66.0  69.0
18       Golu       74        NaN     76.0  96.0
19       Joel       99        NaN     91.0  77.0
20        Tom       72        NaN     65.0  61.0
21      Harsh       98        NaN     95.0  93.0
22      Clara       93        NaN     79.0  78.0
23       Tina       99        NaN     94.0  78.0

Conclusion

In this article, we discussed different ways to read an Excel file into a pandas dataframe. To learn more about Python programming, you can read this article on how to convert JSON to INI format in Python. You might also like this article on how to convert XML to JSON in Python.

I hope you enjoyed reading this article. Stay tuned for more informative articles.

Happy Learning!

Related

Recommended Python Training

Course: Python 3 For Beginners

Over 15 hours of video content with guided instruction for beginners. Learn how to create real world applications and master the basics.

Enroll Now

Filed Under: Basics Author: Aditya Raj

More Python Topics

API Argv Basics Beautiful Soup Cheatsheet Code Code Snippets Command Line Comments Concatenation crawler Data Structures Data Types deque Development Dictionary Dictionary Data Structure In Python Error Handling Exceptions Filehandling Files Functions Games GUI Json Lists Loops Mechanzie Modules Modules In Python Mysql OS pip Pyspark Python Python On The Web Python Strings Queue Requests Scraping Scripts Split Strings System & OS urllib2

Primary Sidebar

Menu

  • Basics
  • Cheatsheet
  • Code Snippets
  • Development
  • Dictionary
  • Error Handling
  • Lists
  • Loops
  • Modules
  • Scripts
  • Strings
  • System & OS
  • Web

Get Our Free Guide To Learning Python

Most Popular Content

  • Reading and Writing Files in Python
  • Python Dictionary – How To Create Dictionaries In Python
  • How to use Split in Python
  • Python String Concatenation and Formatting
  • List Comprehension in Python
  • How to Use sys.argv in Python?
  • How to use comments in Python
  • Try and Except in Python

Recent Posts

  • Count Rows With Null Values in PySpark
  • PySpark OrderBy One or Multiple Columns
  • Select Rows with Null values in PySpark
  • PySpark Count Distinct Values in One or Multiple Columns
  • PySpark Filter Rows in a DataFrame by Condition

Copyright © 2012–2025 · PythonForBeginners.com

  • Home
  • Contact Us
  • Privacy Policy
  • Write For Us