Common scenarios of reading data from files, making a minor adjustment, and then saving the results are demonstrated. This simulates a frequent pattern in data analysis: load, process, save.
First, ensure you have Pandas imported. Typically, we use the alias pd.
import pandas as pd
print(f"Pandas version: {pd.__version__}")
For these exercises, we need some data files. Let's imagine we have two files:
students.csv: A Comma Separated Value file.grades.xlsx: A Microsoft Excel file.If you are running this code locally, you can create these files yourself.
Create students.csv:
Save the following text into a file named students.csv in the same directory as your script or notebook:
StudentID,Name,Major
101,Alice,Physics
102,Bob,Chemistry
103,Charlie,Mathematics
104,David,Physics
Create grades.xlsx:
You'll need a library like openpyxl installed (pip install openpyxl) to work with .xlsx files. Create an Excel file named grades.xlsx with the following data in a sheet named Sheet1:
| StudentID | Course | Grade |
|---|---|---|
| 101 | Mechanics | 85 |
| 102 | Org Chem | 92 |
| 101 | E&M | 78 |
| 103 | Calculus | 95 |
| 102 | Thermo | 88 |
| 104 | Mechanics | 81 |
Now, let's read the students.csv file into a Pandas DataFrame using pd.read_csv().
# Read the CSV file
students_df = pd.read_csv('students.csv')
# Display the DataFrame
print("Students DataFrame:")
print(students_df)
You'll notice Pandas correctly identified the first row as the header and used a default integer index (0, 1, 2, 3).
What if our file used a different separator, like a tab? Or what if we wanted the StudentID column to be the DataFrame's index? We can use parameters in read_csv.
# Example: Reading assuming StudentID should be the index
students_df_indexed = pd.read_csv('students.csv', index_col='StudentID')
print("\nStudents DataFrame with StudentID as index:")
print(students_df_indexed)
Setting index_col='StudentID' tells Pandas to use the values from that column as the row labels (the index) for the DataFrame.
Next, let's load the data from the grades.xlsx file using pd.read_excel().
# Read the Excel file (ensure openpyxl or another engine is installed)
try:
grades_df = pd.read_excel('grades.xlsx')
print("\nGrades DataFrame (from Excel Sheet1):")
print(grades_df)
except ImportError:
print("\nPlease install 'openpyxl' to read Excel files: pip install openpyxl")
except FileNotFoundError:
print("\nMake sure 'grades.xlsx' is in the correct directory.")
By default, read_excel reads the first sheet. If our data was on a different sheet, say "CourseGrades", we would specify it:
# Example: Reading a specific sheet (if it existed)
# grades_df_specific_sheet = pd.read_excel('grades.xlsx', sheet_name='CourseGrades')
You can use the sheet name (string) or its zero-based index (integer) for sheet_name.
Let's take the grades_df we loaded and add a new column. Suppose we want a 'Status' column indicating 'Pass' if the grade is 70 or above, and 'Fail' otherwise. We can achieve this using boolean indexing and assignment.
# Check if grades_df was loaded successfully before proceeding
if 'grades_df' in locals():
# Add a 'Status' column, default to 'Fail'
grades_df['Status'] = 'Fail'
# Update 'Status' to 'Pass' where Grade >= 70
grades_df.loc[grades_df['Grade'] >= 70, 'Status'] = 'Pass'
print("\nGrades DataFrame with Status column added:")
print(grades_df)
else:
print("\nSkipping modification as grades_df was not loaded.")
We first created the column and assigned a default value ('Fail'). Then, we used .loc with a boolean condition (grades_df['Grade'] >= 70) to select the rows where the grade met the passing threshold and updated only their 'Status' value to 'Pass'.
Now that we have modified grades_df, let's save it back to a new CSV file named processed_grades.csv.
A common requirement is to not write the DataFrame's index (0, 1, 2...) into the file, as it often doesn't represent meaningful data. We use the index=False parameter for this.
# Check if grades_df exists before saving
if 'grades_df' in locals():
# Save the modified DataFrame to CSV without the index
grades_df.to_csv('processed_grades.csv', index=False)
print("\nSaved modified grades data to processed_grades.csv (index excluded).")
# You can verify by reading it back or opening the file
# check_csv = pd.read_csv('processed_grades.csv')
# print("\nVerification read from processed_grades.csv:")
# print(check_csv)
else:
print("\nSkipping CSV export as grades_df is not available.")
If you open processed_grades.csv, you'll see the columns StudentID, Course, Grade, and Status, but not the default 0-based index column.
Similarly, we can save the modified DataFrame to an Excel file. Let's call it processed_grades.xlsx and put the data in a sheet named 'Final Grades'. Again, we'll typically want index=False.
# Check if grades_df exists before saving
if 'grades_df' in locals():
try:
# Save the modified DataFrame to Excel without the index, specifying sheet name
grades_df.to_excel('processed_grades.xlsx', sheet_name='Final Grades', index=False)
print("\nSaved modified grades data to processed_grades.xlsx (Sheet: 'Final Grades', index excluded).")
# Verification (optional)
# check_excel = pd.read_excel('processed_grades.xlsx', sheet_name='Final Grades')
# print("\nVerification read from processed_grades.xlsx:")
# print(check_excel)
except ImportError:
print("\nPlease install 'openpyxl' to write Excel files: pip install openpyxl")
else:
print("\nSkipping Excel export as grades_df is not available.")
This creates an Excel file with our processed data neatly organized in the specified sheet.
In this practice session, you used the fundamental Pandas IO functions:
pd.read_csv() to load data from text files, exploring index_col.pd.read_excel() to load data from spreadsheets, exploring sheet_name..to_csv() to save DataFrames, emphasizing the use of index=False..to_excel() to save DataFrames, using index=False and sheet_name.You also performed a simple data modification step between reading and writing, demonstrating a basic data processing workflow. Mastering these input and output operations is essential for integrating Pandas into your data analysis tasks.
Was this section helpful?
© 2026 ApX Machine LearningEngineered with