Pandas – Plotting

Welcome to The Coding College, your resource for mastering coding and data visualization techniques. In this guide, we’ll explore how to use Pandas’ built-in plotting methods to chart the data in a DataFrame or Series with just a few lines of code.

Why Plot Data?

Visualizations help:

  • Identify patterns and trends in your data.
  • Communicate insights effectively to stakeholders.
  • Simplify complex datasets for analysis.

Pandas Plotting Basics

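Pandas builds its plotting on Matplotlib, a popular Python plotting library, so Matplotlib needs to be installed alongside Pandas. Calling the .plot() method on a DataFrame or Series draws a chart, and the kind argument selects the chart type.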

import pandas as pd
import matplotlib.pyplot as plt
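
Because .plot() hands back a Matplotlib Axes object, the chart can be adjusted after the call. The sketch below illustrates this under a few assumptions: the `demo` frame and its column names are made up for illustration, and the `Agg` backend line is only there so the code runs without a display (omit it in a notebook or desktop session).

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window needed
import matplotlib.pyplot as plt
import pandas as pd
from matplotlib.axes import Axes

# Hypothetical frame, for illustration only
demo = pd.DataFrame({"x": [1, 2, 3], "y": [2, 4, 8]})

# kind defaults to "line"; the return value is a Matplotlib Axes
ax = demo.plot(x="x", y="y")
ax.set_ylabel("y value")  # regular Matplotlib calls work on the result

is_axes = isinstance(ax, Axes)
n_lines = len(ax.get_lines())
y_label = ax.get_ylabel()
print(is_axes, n_lines, y_label)

plt.close(ax.get_figure())  # release the figure when done
```

Working with the returned Axes rather than the global plt functions keeps each adjustment tied to a specific chart, which matters once more than one figure is open.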

Sample Dataset

Let’s create a dataset to work with:

data = {
    "Month": ["Jan", "Feb", "Mar", "Apr", "May", "Jun"],
    "Sales": [1500, 1800, 2000, 2200, 2500, 3000],
    "Expenses": [1200, 1400, 1600, 1700, 2000, 2300]
}

df = pd.DataFrame(data)
print(df)

Output:

  Month  Sales  Expenses
0   Jan   1500      1200
1   Feb   1800      1400
2   Mar   2000      1600
3   Apr   2200      1700
4   May   2500      2000
5   Jun   3000      2300

Types of Plots

1. Line Plot

Useful for visualizing trends over time.

df.plot(x="Month", y="Sales", kind="line", title="Monthly Sales", marker="o")
plt.ylabel("Sales ($)")
plt.show()
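
To compare two trends on the same axes, y can also be given a list of columns. A minimal sketch, rebuilding the sample dataset so it runs on its own; the `Agg` backend line is an assumption for running without a display and can be dropped in an interactive session.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window needed
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({
    "Month": ["Jan", "Feb", "Mar", "Apr", "May", "Jun"],
    "Sales": [1500, 1800, 2000, 2200, 2500, 3000],
    "Expenses": [1200, 1400, 1600, 1700, 2000, 2300],
})

# One line per column listed in y; the legend is built from the column names
ax = df.plot(x="Month", y=["Sales", "Expenses"], kind="line", marker="o",
             title="Monthly Sales and Expenses")
ax.set_ylabel("Amount ($)")

line_labels = [line.get_label() for line in ax.get_lines()]
print(line_labels)

plt.close(ax.get_figure())
```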

2. Bar Plot

Useful for comparing values across categories, here Sales and Expenses side by side for each month.

df.plot(x="Month", y=["Sales", "Expenses"], kind="bar", title="Monthly Sales and Expenses")
plt.ylabel("Amount ($)")
plt.show()
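
By default Pandas rotates the category labels of a vertical bar chart by 90 degrees. For short labels such as month names, passing rot=0 keeps them horizontal. The sketch below rebuilds the sample dataset so it is self-contained; the `Agg` backend line is an assumption for running without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window needed
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({
    "Month": ["Jan", "Feb", "Mar", "Apr", "May", "Jun"],
    "Sales": [1500, 1800, 2000, 2200, 2500, 3000],
    "Expenses": [1200, 1400, 1600, 1700, 2000, 2300],
})

# rot=0 keeps the month labels horizontal instead of the default 90 degrees
ax = df.plot(x="Month", y=["Sales", "Expenses"], kind="bar", rot=0,
             title="Monthly Sales and Expenses")
ax.set_ylabel("Amount ($)")

n_bars = len(ax.patches)  # one rectangle per month per column
months_on_axis = [tick.get_text() for tick in ax.get_xticklabels()]
print(n_bars, months_on_axis)

plt.close(ax.get_figure())
```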

3. Horizontal Bar Plot

The same kind of chart with the bars drawn horizontally, which helps when category labels are long.

df.plot(x="Month", y="Sales", kind="barh", title="Monthly Sales (Horizontal)")
plt.xlabel("Sales ($)")
plt.show()

4. Scatter Plot

Shows relationships between two variables.

df.plot(kind="scatter", x="Sales", y="Expenses", title="Sales vs Expenses", color="green")
plt.xlabel("Sales ($)")
plt.ylabel("Expenses ($)")
plt.show()
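
A scatter plot shows a relationship visually; a correlation coefficient puts a number on it. As a small companion sketch (not part of the plot itself), Series.corr computes the Pearson correlation between the two columns of the sample dataset:

```python
import pandas as pd

df = pd.DataFrame({
    "Month": ["Jan", "Feb", "Mar", "Apr", "May", "Jun"],
    "Sales": [1500, 1800, 2000, 2200, 2500, 3000],
    "Expenses": [1200, 1400, 1600, 1700, 2000, 2300],
})

# Pearson correlation (the default method) between the two plotted columns
r = df["Sales"].corr(df["Expenses"])
print(f"Correlation between Sales and Expenses: {r:.3f}")
```

For this dataset the value comes out close to 1, which matches the nearly straight upward band of points in the scatter plot.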

5. Histogram

Displays the distribution of a single variable.

df["Sales"].plot(kind="hist", title="Sales Distribution", bins=5, color="orange")
plt.xlabel("Sales ($)")
plt.show()
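
To see what the histogram is counting, the same equal-width binning can be reproduced with NumPy. This is a sketch for inspection only, assuming NumPy is available (it is installed as a Pandas dependency); with bins=5 the range from 1500 to 3000 is split into five bins of width 300.

```python
import numpy as np
import pandas as pd

sales = pd.Series([1500, 1800, 2000, 2200, 2500, 3000], name="Sales")

# Five equal-width bins between the minimum and maximum value
counts, edges = np.histogram(sales, bins=5)

for left, right, count in zip(edges[:-1], edges[1:], counts):
    print(f"{left:.0f} to {right:.0f}: {count}")
```

Each bin includes its left edge and excludes its right edge, except the last bin, which includes both. With only six data points most bins hold a single value, so a histogram becomes more informative on larger datasets.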

6. Box Plot

Summarizes each column’s median, quartiles, and range, and marks outliers as individual points.

df[["Sales", "Expenses"]].plot(kind="box", title="Box Plot of Sales and Expenses")
plt.show()
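
The box in a box plot spans the first to third quartile, with a line at the median. As a rough check on what the chart displays, the same statistics can be computed directly with DataFrame.quantile, which uses linear interpolation by default:

```python
import pandas as pd

df = pd.DataFrame({
    "Sales": [1500, 1800, 2000, 2200, 2500, 3000],
    "Expenses": [1200, 1400, 1600, 1700, 2000, 2300],
})

# Rows are the requested quantiles, columns are the original columns
stats = df[["Sales", "Expenses"]].quantile([0.25, 0.5, 0.75])
print(stats)

sales_median = stats.loc[0.5, "Sales"]
sales_iqr = stats.loc[0.75, "Sales"] - stats.loc[0.25, "Sales"]
print(sales_median, sales_iqr)
```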

7. Pie Chart

Visualizes proportions in your data.

df["Sales"].plot(kind="pie", labels=df["Month"], autopct="%1.1f%%", title="Sales Distribution")
plt.ylabel("")  # Hide the y-axis label for clarity
plt.show()
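
The percentages that autopct prints are each value divided by the column total. The sketch below computes them explicitly and, as one possible alternative to passing labels, moves Month into the index so the wedges are labeled from it. The `Agg` backend line is an assumption for running without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window needed
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({
    "Month": ["Jan", "Feb", "Mar", "Apr", "May", "Jun"],
    "Sales": [1500, 1800, 2000, 2200, 2500, 3000],
})

# With Month as the index, the pie wedges take their labels from it
sales_by_month = df.set_index("Month")["Sales"]

# Share of total sales per month, in percent
share = sales_by_month / sales_by_month.sum() * 100
print(share.round(1))

ax = sales_by_month.plot(kind="pie", autopct="%1.1f%%", title="Sales Distribution")
ax.set_ylabel("")  # hide the column name that would appear as a y-axis label
plt.close(ax.get_figure())
```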

Customizing Plots

Adding Titles, Labels, and Legends

ax = df.plot(x="Month", y="Sales", kind="line", marker="o")
ax.set_title("Customized Monthly Sales")
ax.set_xlabel("Month")
ax.set_ylabel("Sales ($)")
ax.legend(["Sales"])
plt.show()
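
The ax argument of .plot() works in the other direction as well: an Axes created with plt.subplots can be passed in, which allows several charts in one figure. A minimal sketch, with the layout and figure size chosen arbitrarily and the `Agg` backend line assumed for running without a display:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window needed
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({
    "Month": ["Jan", "Feb", "Mar", "Apr", "May", "Jun"],
    "Sales": [1500, 1800, 2000, 2200, 2500, 3000],
    "Expenses": [1200, 1400, 1600, 1700, 2000, 2300],
})

# One figure with two side-by-side Axes
fig, (ax_left, ax_right) = plt.subplots(1, 2, figsize=(10, 4))

# Draw each chart onto a specific Axes via the ax argument
df.plot(x="Month", y="Sales", kind="line", marker="o", ax=ax_left, legend=False)
ax_left.set_title("Sales")
ax_left.set_ylabel("Amount ($)")

df.plot(x="Month", y="Expenses", kind="bar", rot=0, ax=ax_right, legend=False)
ax_right.set_title("Expenses")

fig.tight_layout()  # avoid overlapping labels between the two charts
n_axes = len(fig.axes)
titles = [ax_left.get_title(), ax_right.get_title()]
print(n_axes, titles)

plt.close(fig)
```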

Changing Plot Style

Matplotlib ships with named style sheets. Calling plt.style.use() changes the default look of every plot drawn afterward:

plt.style.use("ggplot")  # Change the style
df.plot(x="Month", y="Sales", kind="line", title="Sales with GGPlot Style")
plt.show()
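
When a style should apply to a single chart only, Matplotlib also provides plt.style.context, which restores the previous settings when the block ends. A short sketch, again assuming the `Agg` backend for running without a display:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window needed
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({
    "Month": ["Jan", "Feb", "Mar", "Apr", "May", "Jun"],
    "Sales": [1500, 1800, 2000, 2200, 2500, 3000],
})

print(plt.style.available)  # names of the styles installed with Matplotlib

before = plt.rcParams["axes.facecolor"]

# The style applies only inside the with block
with plt.style.context("ggplot"):
    inside = plt.rcParams["axes.facecolor"]
    ax = df.plot(x="Month", y="Sales", kind="line", title="Sales with GGPlot Style")
    plt.close(ax.get_figure())

after = plt.rcParams["axes.facecolor"]  # back to the earlier setting
print(before, inside, after)
```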

Real-World Applications of Pandas Plotting

  1. Business Analytics: Track key metrics like sales, expenses, or profits.
  2. Data Science: Visualize data distributions and relationships for exploratory analysis.
  3. Operations Management: Monitor trends and variations in operational data.

Learn with The Coding College

At The Coding College, we help you enhance your programming and data visualization skills. Visit The Coding College for:

  • Beginner-friendly tutorials on data manipulation and visualization.
  • Projects that let you apply your knowledge in real-world scenarios.
  • Insights into mastering data analysis tools like Pandas and Matplotlib.

Conclusion

Pandas’ plotting capabilities let you create line, bar, scatter, histogram, box, and pie charts directly from a DataFrame or Series with minimal effort, and Matplotlib remains available whenever you need finer control over titles, labels, legends, and styles.
