Matplotlib Scatter

Welcome to The Coding College, your go-to source for learning programming and data visualization! This tutorial covers scatter plots in Matplotlib, which show how two variables relate and how their values are distributed. Scatter plots are especially useful for spotting correlations, clusters, and outliers.

What Is a Scatter Plot?

A scatter plot displays data points as individual markers on a two-dimensional grid, with the x-axis representing one variable and the y-axis representing another. It is a good choice when both variables are numeric and you want to identify trends or clusters in the data.

Creating a Basic Scatter Plot

To create a scatter plot in Matplotlib, use the plt.scatter() function.

Example: Basic Scatter Plot

import matplotlib.pyplot as plt  

# Sample data
x = [5, 7, 8, 7, 2, 17, 2, 9, 4, 11]  
y = [99, 86, 87, 88, 100, 86, 103, 87, 94, 78]  

# Create scatter plot
plt.scatter(x, y)  
plt.title("Basic Scatter Plot")  
plt.xlabel("X-axis Label")  
plt.ylabel("Y-axis Label")  
plt.show()  

Output: A basic scatter plot with points representing the data.

Customizing Scatter Plots

1. Changing Marker Size

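Adjust the size of the markers using the s parameter, which sets the marker area in points squared. Pass a single number to size every marker the same, or a list with one value per point: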

sizes = [20, 50, 100, 200, 500, 100, 70, 40, 300, 10]  
plt.scatter(x, y, s=sizes)  
plt.title("Scatter Plot with Custom Marker Sizes")  
plt.show()  

2. Changing Marker Color

Specify colors with the c parameter. Pass a single color to use it for every marker, or pass one number per point together with a cmap to map those numbers onto a color gradient:

colors = [5, 10, 15, 20, 25, 30, 35, 40, 45, 50]  
plt.scatter(x, y, c=colors, cmap="viridis")  # Apply a colormap  
plt.colorbar()  # Add a colorbar showing which value each color represents
plt.title("Scatter Plot with Gradient Colors")  
plt.show()  

3. Changing Marker Style

Choose different marker shapes using the marker parameter, for example "o" (circle), "x" (cross), "^" (triangle), or "s" (square).

plt.scatter(x, y, marker="x", color="red")  
plt.title("Scatter Plot with 'X' Markers")  
plt.show()  

4. Adding Transparency

Control marker transparency with the alpha parameter, from 0 (fully transparent) to 1 (fully opaque):

plt.scatter(x, y, alpha=0.5, color="blue")  
plt.title("Scatter Plot with Transparency")  
plt.show()  

Comparing Two Datasets in One Scatter Plot

You can plot multiple datasets by calling plt.scatter() multiple times.

x1 = [1, 3, 5, 7, 9]  
y1 = [5, 15, 25, 35, 45]  

x2 = [2, 4, 6, 8, 10]  
y2 = [10, 20, 30, 40, 50]  

plt.scatter(x1, y1, color="blue", label="Dataset 1")  
plt.scatter(x2, y2, color="green", label="Dataset 2")  

plt.title("Comparing Two Datasets")  
plt.legend()  
plt.show()  

Advanced Scatter Plot: Adding Annotations

You can annotate specific data points to highlight them:

plt.scatter(x, y, color="purple")  

# Annotate the point at (7, 88); the label is offset 10 points right and 15 points down
plt.annotate("Important Point", xy=(7, 88),
             xytext=(10, -15), textcoords="offset points",
             arrowprops=dict(arrowstyle="->"))

plt.title("Scatter Plot with Annotations")  
plt.show()  
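
The example above hard-codes the coordinates of the annotated point. As a small sketch of an alternative, the point can be looked up from the data instead. The choice of "highest y value" as the point of interest is an assumption made for illustration.

```python
import matplotlib.pyplot as plt

# Same sample data as the basic example
x = [5, 7, 8, 7, 2, 17, 2, 9, 4, 11]
y = [99, 86, 87, 88, 100, 86, 103, 87, 94, 78]

# Index of the point with the largest y value (an illustrative choice)
top = max(range(len(y)), key=lambda i: y[i])

plt.scatter(x, y, color="purple")

# Annotate that point using its own coordinates
label = plt.annotate(
    f"Highest y ({x[top]}, {y[top]})",
    xy=(x[top], y[top]),
    xytext=(10, -15),
    textcoords="offset points",
    arrowprops=dict(arrowstyle="->"),
)

plt.title("Scatter Plot with a Computed Annotation")
plt.show()
```

Looking the point up this way keeps the annotation attached to the right marker if the data changes.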

Practical Exercises

Exercise 1: Custom Scatter Plot

Create a scatter plot with:

  • Different marker sizes based on a list.
  • Gradient colors using a colormap.
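
If you want a starting point for Exercise 1, here is one possible sketch that drives both marker size and color from the same list. The values list and the scaling factor of 30 are invented for illustration; any numeric list of the same length as x and y would work.

```python
import matplotlib.pyplot as plt

x = [5, 7, 8, 7, 2, 17, 2, 9, 4, 11]
y = [99, 86, 87, 88, 100, 86, 103, 87, 94, 78]

# Hypothetical third variable, invented for this sketch
values = [3, 8, 1, 6, 9, 2, 7, 5, 4, 10]

# Scale the values into marker areas (s is measured in points squared)
sizes = [v * 30 for v in values]

# Size and color both encode the same variable
points = plt.scatter(x, y, s=sizes, c=values, cmap="viridis", alpha=0.7)
plt.colorbar(label="Value")
plt.title("Marker Size and Color from One Variable")
plt.xlabel("X-axis Label")
plt.ylabel("Y-axis Label")
plt.show()
```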

Exercise 2: Multi-Dataset Comparison

Create a scatter plot to compare three datasets, each with a unique color and marker style. Add a legend for clarity.
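
One possible way to approach Exercise 2 is to keep each dataset's label, data, color, and marker together and loop over them. The three datasets below are made up for this sketch, and the tuple layout is only one of several reasonable ways to organize them.

```python
import matplotlib.pyplot as plt

# Hypothetical datasets, invented for this sketch: (label, x, y, color, marker)
datasets = [
    ("Dataset 1", [1, 3, 5, 7, 9], [5, 15, 25, 35, 45], "blue", "o"),
    ("Dataset 2", [2, 4, 6, 8, 10], [10, 20, 30, 40, 50], "green", "^"),
    ("Dataset 3", [1, 4, 6, 8, 9], [40, 30, 22, 12, 8], "red", "s"),
]

# One plt.scatter() call per dataset, each with its own color and marker
for label, xs, ys, color, marker in datasets:
    plt.scatter(xs, ys, color=color, marker=marker, label=label)

plt.title("Comparing Three Datasets")
legend = plt.legend()
plt.show()
```

Varying the marker shape as well as the color keeps the datasets distinguishable in grayscale prints and for readers with color vision deficiencies.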

Common Issues and Solutions

  1. Markers Overlap
    • Cause: Dense data points in a small range.
    • Solution: Use transparency (alpha) or jitter the data slightly.
  2. Unclear Axis Labels
    • Cause: Missing or ambiguous labels.
    • Solution: Always include clear xlabel and ylabel.
  3. Hard-to-Read Plot
    • Cause: Poor choice of colors or marker styles.
    • Solution: Use contrasting colors and larger markers.
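
The jitter fix for overlapping markers can be sketched as follows. The jitter() helper, its default amount of 0.15, and the sample data are all assumptions made for this illustration; Matplotlib has no built-in jitter option for plt.scatter(), so the noise is added to the data before plotting.

```python
import random

import matplotlib.pyplot as plt

random.seed(0)  # fixed seed so the jitter is repeatable

# Hypothetical data in which many points share the same coordinates
x = [1, 1, 1, 2, 2, 2, 2, 3, 3, 3] * 5
y = [2, 2, 3, 1, 1, 2, 3, 3, 3, 2] * 5


def jitter(values, amount=0.15):
    """Return a copy of values with uniform noise in [-amount, amount] added."""
    return [v + random.uniform(-amount, amount) for v in values]


x_j = jitter(x)
y_j = jitter(y)

# Transparency makes dense areas appear darker than sparse ones
plt.scatter(x_j, y_j, alpha=0.5, color="blue")
plt.title("Overlapping Points Separated with Jitter")
plt.xlabel("X-axis Label")
plt.ylabel("Y-axis Label")
plt.show()
```

Keep the jitter amount small relative to the spacing between real values, so the noise separates overlapping markers without suggesting differences that are not in the data.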

Why Learn with The Coding College?

At The Coding College, we aim to provide practical, easy-to-follow tutorials for learners of all levels. Mastering scatter plots in Matplotlib will help you explore data relationships and present your findings effectively.

Conclusion

Scatter plots in Matplotlib are a powerful tool for data analysis and visualization. With features like marker customization, color gradients, and annotations, you can create professional and insightful plots tailored to your needs.
