Welcome to The Coding College, where we provide hands-on exercises to solidify your programming skills. In this post, we’ll dive into Pandas exercises to help you practice data manipulation, cleaning, and analysis.
Why Practice Pandas?
Practicing with Pandas helps:
- Build confidence in handling real-world datasets.
- Strengthen your understanding of core concepts like DataFrames and Series.
- Improve your problem-solving skills in data analysis tasks.
Getting Started
Before starting, make sure you have Pandas installed. Use the following command:
pip install pandas
Let’s import Pandas for our exercises:
import pandas as pd
Beginner-Level Exercises
1. Create a Pandas Series
Create a Pandas Series from the list [10, 20, 30, 40, 50].
# Your code here
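If you get stuck, here is one possible approach, shown as a minimal sketch. The variable names values and series are our own choices, not part of the exercise.

```python
import pandas as pd

# Build a Series from a plain Python list; pandas assigns a default 0..n-1 index.
values = [10, 20, 30, 40, 50]
series = pd.Series(values)

print(series)
```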
2. Create a DataFrame
Create a DataFrame from the following dictionary:
data = {
"Name": ["Alice", "Bob", "Charlie"],
"Age": [25, 30, 35],
"Salary": [50000, 60000, 70000]
}
# Your code here
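One way to do this, as a sketch: pass the dictionary straight to pd.DataFrame, so each key becomes a column. The name df is our own choice.

```python
import pandas as pd

data = {
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35],
    "Salary": [50000, 60000, 70000],
}

# Each dictionary key becomes a column; each list supplies that column's values.
df = pd.DataFrame(data)

print(df)
```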
3. Read a CSV File
Download a sample CSV file (e.g., Sample Data) and read it into a Pandas DataFrame.
# Your code here
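The sketch below shows the general shape of a solution. Since the exercise does not fix a particular file, we assume a small in-memory CSV (the csv_text contents are made up for illustration) and read it through io.StringIO; with a real download you would pass the file path to pd.read_csv instead.

```python
import io

import pandas as pd

# Hypothetical stand-in for a downloaded file. With a real file on disk,
# use pd.read_csv("path/to/your_file.csv") instead.
csv_text = """Name,Age,Salary
Alice,25,50000
Bob,30,60000
Charlie,35,70000
"""

# read_csv accepts a path or any file-like object, such as StringIO.
df = pd.read_csv(io.StringIO(csv_text))

print(df.head())
```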
4. Display Basic Information
Use the .info() and .describe() methods to summarize the DataFrame.
# Your code here
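A possible solution, assuming the DataFrame from Exercise 2 (rebuilt here so the snippet runs on its own). Note that .info() prints its report directly, while .describe() returns a new DataFrame of summary statistics for the numeric columns.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35],
    "Salary": [50000, 60000, 70000],
})

# info() prints column names, non-null counts, and dtypes.
df.info()

# describe() returns count, mean, std, min, quartiles, and max per numeric column.
summary = df.describe()
print(summary)
```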
5. Select Specific Columns
From the DataFrame in Exercise 2, display only the Name and Salary columns.
# Your code here
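One way to approach it, as a sketch: index the DataFrame with a list of column names. The name subset is our own.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35],
    "Salary": [50000, 60000, 70000],
})

# A list inside the brackets selects several columns and returns a DataFrame.
subset = df[["Name", "Salary"]]

print(subset)
```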
Intermediate-Level Exercises
6. Filter Data
Filter the DataFrame to show rows where Age > 28.
# Your code here
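A minimal sketch using boolean indexing, assuming the DataFrame from Exercise 2. The comparison produces a True/False mask, and indexing with that mask keeps only the True rows.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35],
    "Salary": [50000, 60000, 70000],
})

# Build a boolean mask, then use it to select the matching rows.
mask = df["Age"] > 28
older = df[mask]

print(older)
```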
7. Add a New Column
Add a new column, Bonus, which is 10% of the Salary for each row.
# Your code here
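One possible solution, again assuming the Exercise 2 DataFrame. Arithmetic on a column is applied to every row at once, so no loop is needed.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35],
    "Salary": [50000, 60000, 70000],
})

# Assigning to a new column name creates the column; the multiplication
# is vectorized across all rows.
df["Bonus"] = df["Salary"] * 0.10

print(df)
```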
8. Group Data
Group the data by Age and calculate the mean salary for each age group.
# Your code here
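A sketch of one approach, assuming the Exercise 2 DataFrame. Because every age appears only once in that data, each group's mean is simply that person's salary; the same code works unchanged on data with repeated ages.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35],
    "Salary": [50000, 60000, 70000],
})

# Group rows by Age, pick the Salary column, and average within each group.
mean_salary = df.groupby("Age")["Salary"].mean()

print(mean_salary)
```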
9. Handle Missing Values
Create a DataFrame with some missing values:
data = {
"A": [1, 2, None, 4],
"B": [None, 2, 3, 4]
}
Fill the missing values in column A with the column mean.
# Your code here
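Here is a possible solution as a sketch. It relies on the fact that .mean() skips missing values by default, so the mean of column A is computed from 1, 2, and 4 only. Column B is left untouched, as the exercise asks only about A.

```python
import pandas as pd

data = {
    "A": [1, 2, None, 4],
    "B": [None, 2, 3, 4],
}
df = pd.DataFrame(data)

# mean() ignores NaN by default; fillna() replaces NaN with that value.
mean_a = df["A"].mean()
df["A"] = df["A"].fillna(mean_a)

print(df)
```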
10. Merge Two DataFrames
Given two DataFrames:
df1 = pd.DataFrame({"ID": [1, 2, 3], "Name": ["Alice", "Bob", "Charlie"]})
df2 = pd.DataFrame({"ID": [1, 2, 4], "Score": [85, 90, 88]})
Merge them on the ID column.
# Your code here
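One way to solve this, sketched below. The exercise does not say which join type to use, so we show the default inner join (only IDs present in both frames) alongside an outer join (all IDs, with missing values where a frame has no match) as an assumption about what you might want to compare.

```python
import pandas as pd

df1 = pd.DataFrame({"ID": [1, 2, 3], "Name": ["Alice", "Bob", "Charlie"]})
df2 = pd.DataFrame({"ID": [1, 2, 4], "Score": [85, 90, 88]})

# Inner join: keeps only IDs found in both DataFrames (1 and 2).
inner = pd.merge(df1, df2, on="ID", how="inner")

# Outer join: keeps every ID; unmatched cells become NaN.
outer = pd.merge(df1, df2, on="ID", how="outer")

print(inner)
print(outer)
```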
Advanced-Level Exercises
11. Pivot Table
Create a pivot table for the following DataFrame:
data = {
"Region": ["North", "South", "North", "East"],
"Sales": [200, 150, 300, 400],
"Year": [2021, 2021, 2022, 2022]
}
df = pd.DataFrame(data)
Summarize Sales by Region and Year.
# Your code here
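A sketch of one possible pivot table. The exercise does not specify the aggregation, so we assume a sum of Sales, with Region as rows and Year as columns; fill_value=0 is our own choice for Region/Year pairs that have no sales.

```python
import pandas as pd

data = {
    "Region": ["North", "South", "North", "East"],
    "Sales": [200, 150, 300, 400],
    "Year": [2021, 2021, 2022, 2022],
}
df = pd.DataFrame(data)

# Rows are regions, columns are years, cells hold total sales.
# fill_value=0 replaces empty combinations (assumption for readability).
pivot = df.pivot_table(
    values="Sales",
    index="Region",
    columns="Year",
    aggfunc="sum",
    fill_value=0,
)

print(pivot)
```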
12. Calculate Correlations
For the following DataFrame, calculate the correlation between columns:
data = {
"X": [1, 2, 3, 4, 5],
"Y": [5, 4, 3, 2, 1],
"Z": [2, 3, 4, 5, 6]
}
df = pd.DataFrame(data)
# Your code here
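One possible solution, as a sketch. DataFrame.corr() computes pairwise Pearson correlation by default. In this data, Y falls as X rises and Z rises in step with X, so you should expect values at (or extremely close to) -1 and 1 respectively.

```python
import pandas as pd

data = {
    "X": [1, 2, 3, 4, 5],
    "Y": [5, 4, 3, 2, 1],
    "Z": [2, 3, 4, 5, 6],
}
df = pd.DataFrame(data)

# Pairwise Pearson correlation between all numeric columns.
corr = df.corr()

print(corr)
```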
13. Time Series Data
Create a time-indexed DataFrame for the dates 2023-01-01 to 2023-01-07 and populate it with random sales data.
# Your code here
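A minimal sketch of one approach. The exercise leaves the details open, so the column name Sales, the value range (100 to 999), and the fixed seed are all our own assumptions; the seed only makes the random numbers repeatable between runs.

```python
import numpy as np
import pandas as pd

# Seven consecutive daily timestamps, both endpoints included.
dates = pd.date_range(start="2023-01-01", end="2023-01-07", freq="D")

# Seeded generator so the "random" data is reproducible (assumption).
rng = np.random.default_rng(seed=42)
sales = rng.integers(low=100, high=1000, size=len(dates))

df = pd.DataFrame({"Sales": sales}, index=dates)
df.index.name = "Date"

print(df)
```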
14. Export Data
Export the DataFrame from Exercise 13 to a CSV file named sales_data.csv.
# Your code here
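One way to do this, sketched below. The snippet rebuilds a small time-indexed DataFrame so it runs on its own (the seed and value range are assumptions). To avoid leaving files behind, it writes sales_data.csv into a temporary directory and reads it back as a check; in your own work you would typically just call df.to_csv("sales_data.csv").

```python
import tempfile
from pathlib import Path

import numpy as np
import pandas as pd

# Rebuild a time-indexed DataFrame like the one from Exercise 13.
dates = pd.date_range("2023-01-01", "2023-01-07", freq="D")
rng = np.random.default_rng(seed=42)
df = pd.DataFrame({"Sales": rng.integers(100, 1000, size=len(dates))}, index=dates)

with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = Path(tmp_dir) / "sales_data.csv"

    # index_label names the index column in the file's header row.
    df.to_csv(csv_path, index_label="Date")

    # Read the file back to confirm the export worked.
    round_trip = pd.read_csv(csv_path, index_col="Date", parse_dates=True)

print(round_trip)
```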
15. Advanced Cleaning
Given a DataFrame with inconsistent data:
data = {
"Name": ["Alice", "alice ", "BOB", " Charlie"],
"Age": [25, None, 30, 35],
"Salary": ["50K", "60k", "70000", "$80,000"]
}
df = pd.DataFrame(data)
- Strip extra spaces and standardize the Name column.
- Fill missing values in the Age column with the mean age.
- Convert the Salary column to numeric values.
# Your code here
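Here is one possible cleaning pipeline, as a sketch under stated assumptions: we take "standardize" to mean title case for names, and we treat a trailing K or k in Salary as "thousands". Other conventions are equally valid, so adjust these rules to your own data.

```python
import pandas as pd

data = {
    "Name": ["Alice", "alice ", "BOB", " Charlie"],
    "Age": [25, None, 30, 35],
    "Salary": ["50K", "60k", "70000", "$80,000"],
}
df = pd.DataFrame(data)

# 1. Trim whitespace and use title case (assumed meaning of "standardize").
df["Name"] = df["Name"].str.strip().str.title()

# 2. Replace missing ages with the mean of the known ages.
df["Age"] = df["Age"].fillna(df["Age"].mean())

# 3. Remove "$" and ",", then expand a trailing "K" to thousands (assumption).
raw = df["Salary"].str.upper().str.replace(r"[$,]", "", regex=True)
has_k = raw.str.endswith("K")
numeric = pd.to_numeric(raw.str.rstrip("K"))
df["Salary"] = numeric.where(~has_k, numeric * 1000)

print(df)
```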
Learning Beyond Exercises
Practice makes perfect, but learning doesn’t stop here! Visit The Coding College to:
- Access solutions to these exercises.
- Explore more advanced tutorials on data cleaning, visualization, and analysis.
- Work on real-world projects to apply your Pandas knowledge.
Conclusion
These exercises are designed to challenge and enhance your Pandas skills. Whether you’re a beginner or an advanced user, consistent practice will help you become proficient in handling data.