R Scatter Plot

Welcome to The Coding College! In this tutorial, we’ll explore scatter plots in R, one of the most commonly used visualization tools for analyzing relationships between two variables. Whether you’re working with small datasets or large-scale data analysis, scatter plots help you identify patterns, trends, and correlations.

By the end of this guide, you’ll know:

  • How to create a scatter plot in R.
  • How to customize scatter plots with colors, markers, and legends.
  • How to add trend lines and analyze data relationships effectively.

What is a Scatter Plot?

A scatter plot is a graph that displays data points on a two-dimensional plane. Each point represents an observation, with its position determined by two variables: one plotted along the x-axis and the other along the y-axis.

Scatter plots are ideal for:

  • Visualizing Correlations: Understanding relationships between variables.
  • Identifying Clusters: Spotting groupings or patterns in data.
  • Detecting Outliers: Recognizing data points that deviate significantly from others.
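
As a rough illustration of the first and third points, the sketch below plots a small made-up dataset that contains one unusual observation, then compares the Pearson correlation with and without it. The variable names (hours, score) and all values are hypothetical, chosen only for this example.

```r
# Hypothetical data: a roughly linear trend plus one unusual point
hours <- c(1, 2, 3, 4, 5, 6, 7, 8)
score <- c(52, 55, 61, 64, 70, 73, 30, 82)  # the 7th value is the outlier

plot(hours, score, pch = 16,
     main = "Trend with One Outlier",
     xlab = "Hours", ylab = "Score")

# Pearson correlation with and without the unusual point
r_all     <- cor(hours, score)
r_trimmed <- cor(hours[-7], score[-7])
print(c(all = r_all, without_outlier = r_trimmed))
```

Dropping the point here only shows how strongly a single outlier can affect cor(). Whether an outlier should be removed in a real analysis depends on the data.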

Creating a Scatter Plot in R

Basic Scatter Plot with plot()

The plot() function in base R is the simplest way to create a scatter plot. Given two numeric vectors of equal length, it draws one point for each (x, y) pair.

Example: Scatter Plot

# Sample data
x <- c(1, 2, 3, 4, 5)
y <- c(2, 4, 6, 8, 10)

# Create a scatter plot
plot(x, y, main = "Basic Scatter Plot", xlab = "X-Axis", ylab = "Y-Axis")

This creates a simple scatter plot with labeled axes and a title.
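
plot() also accepts a formula together with a data frame, which is handy when the variables are columns rather than separate vectors. A minimal sketch, assuming R's built-in mtcars dataset (not part of the original example):

```r
# mtcars ships with R; wt is weight in 1000 lbs, mpg is miles per gallon
plot(mpg ~ wt, data = mtcars,
     main = "Fuel Economy vs. Weight",
     xlab = "Weight (1000 lbs)", ylab = "Miles per Gallon")
```

The formula reads as y ~ x, so mpg goes on the y-axis and wt on the x-axis.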

Customizing Scatter Plots

1. Change Point Colors

The col argument sets the color of the points. The example also uses pch = 16 (solid circles, covered in the next section) so that the color is easy to see.

plot(x, y, col = "blue", pch = 16, main = "Scatter Plot with Custom Colors")

2. Adjust Point Shapes

The pch argument controls the shape of the points:

  • pch = 1: Open circle (default)
  • pch = 16: Solid circle
  • pch = 17: Solid triangle

plot(x, y, col = "red", pch = 17, main = "Scatter Plot with Custom Shapes")
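
To compare the three shapes listed above side by side, a small sketch like the following can help. The layout (one point per shape on a blank canvas, with a label underneath) is just one way to do it, and the helper variable names are made up for this example.

```r
# The three pch values discussed above
shapes <- c(1, 16, 17)
pos_x  <- seq_along(shapes)
pos_y  <- rep(1, length(shapes))

# Draw one large point per shape and hide the axes
plot(pos_x, pos_y, pch = shapes, cex = 2,
     xlim = c(0.5, 3.5), ylim = c(0.5, 1.5),
     xaxt = "n", yaxt = "n", xlab = "", ylab = "",
     main = "Point Shapes")

# Label each point with its pch value, placed below the point
text(pos_x, pos_y, labels = paste("pch =", shapes), pos = 1, offset = 1.5)
```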

3. Change Point Size

Use the cex argument to adjust point size. It is a multiplier relative to the default of 1, so cex = 1.5 draws points 50% larger than normal.

plot(x, y, cex = 1.5, col = "green", main = "Scatter Plot with Larger Points")

Adding Additional Features to Scatter Plots

1. Add a Grid

Use the grid() function to add a grid for better readability.

plot(x, y, main = "Scatter Plot with Grid")
grid()
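
Because grid() is called after plot(), the grid lines are drawn on top of the points. If that is a problem, one option is the panel.first argument of plot(), which is evaluated after the axes are set up but before the points are drawn. A small sketch using the same sample data:

```r
# Sample data from earlier in the tutorial
x <- c(1, 2, 3, 4, 5)
y <- c(2, 4, 6, 8, 10)

# panel.first draws the grid before the points, so the points sit on top
plot(x, y, pch = 16, col = "blue",
     main = "Scatter Plot with Grid Behind Points",
     panel.first = grid())
```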

2. Add a Legend

The legend() function helps identify groups or categories in your data.

# Create a scatter plot
plot(x, y, col = "blue", pch = 16, main = "Scatter Plot with Legend")

# Add a legend
legend("topleft", legend = "Group 1", col = "blue", pch = 16)

Multiple Scatter Plots on the Same Graph

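You can visualize multiple datasets on a single scatter plot using the points() function, which adds points to an existing plot. It does not rescale the axes, so set xlim and ylim in the initial plot() call if the added data extends beyond the range of the first dataset.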

Example: Multiple Datasets

# Additional dataset
x2 <- c(1, 2, 3, 4, 5)
y2 <- c(3, 6, 9, 12, 15)

# Plot the first dataset; ylim spans both datasets so no points are cut off
plot(x, y, col = "blue", pch = 16, ylim = range(y, y2), main = "Multiple Scatter Plots", xlab = "X-Axis", ylab = "Y-Axis")

# Add the second dataset
points(x2, y2, col = "red", pch = 17)

# Add a legend
legend("topleft", legend = c("Dataset 1", "Dataset 2"), col = c("blue", "red"), pch = c(16, 17))

Adding Trend Lines to Scatter Plots

Trend lines are useful for highlighting the relationship between variables. Use the abline() function to add a linear trend line.

Example: Add a Trend Line

# Create a scatter plot
plot(x, y, col = "blue", pch = 16, main = "Scatter Plot with Trend Line")

# Add a linear trend line
abline(lm(y ~ x), col = "red", lwd = 2)

Here, lm(y ~ x) fits a linear model of y on x, and abline() draws the fitted line using the model's intercept and slope.
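
If you want the numbers behind the line, you can store the model and read its coefficients with coef(). The sketch below uses the same sample data. Because y is exactly 2 * x there, the fitted slope is 2 and the intercept is 0, up to floating-point rounding.

```r
# Sample data from earlier in the tutorial
x <- c(1, 2, 3, 4, 5)
y <- c(2, 4, 6, 8, 10)

# Fit the model once and keep it
fit   <- lm(y ~ x)
coefs <- coef(fit)  # named vector with "(Intercept)" and "x" (the slope)
print(coefs)

# Draw the same trend line from the explicit intercept (a) and slope (b)
plot(x, y, col = "blue", pch = 16, main = "Scatter Plot with Trend Line")
abline(a = coefs[["(Intercept)"]], b = coefs[["x"]], col = "red", lwd = 2)
```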

Advanced Scatter Plots with ggplot2

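The ggplot2 package builds plots in layers: you supply a data frame, map its columns to visual properties with aes(), and add geoms such as geom_point(). This makes detailed customization and styling easier than in base R.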

Install and Load ggplot2

install.packages("ggplot2")  # run once to install the package
library(ggplot2)             # run in every session to load it

Example: Scatter Plot with ggplot2

# Create a data frame
data <- data.frame(x = x, y = y)

# Create a scatter plot
ggplot(data, aes(x = x, y = y)) +
  geom_point(color = "blue", size = 3) +
  ggtitle("Scatter Plot with ggplot2") +
  xlab("X-Axis") +
  ylab("Y-Axis")

Adding a Trend Line in ggplot2

ggplot(data, aes(x = x, y = y)) +
  geom_point(color = "blue", size = 3) +
  geom_smooth(method = "lm", color = "red") +
  ggtitle("Scatter Plot with Trend Line in ggplot2") +
  xlab("X-Axis") +
  ylab("Y-Axis")
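
By default, geom_smooth() also draws a shaded confidence band around the line. Setting se = FALSE hides it. The sketch below combines that with a grouping column mapped to color, so each group gets its own points and its own trend line. It reuses the two sample datasets from the base R sections. The data frame name df and the column name group are made up for this example.

```r
library(ggplot2)

# Both sample datasets stacked into one data frame with a group label
df <- data.frame(
  x     = c(1, 2, 3, 4, 5, 1, 2, 3, 4, 5),
  y     = c(2, 4, 6, 8, 10, 3, 6, 9, 12, 15),
  group = rep(c("Dataset 1", "Dataset 2"), each = 5)
)

# Mapping color to group inside aes() colors points and lines per group
p <- ggplot(df, aes(x = x, y = y, color = group)) +
  geom_point(size = 3) +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  labs(title = "Grouped Scatter Plot with Trend Lines",
       x = "X-Axis", y = "Y-Axis", color = "Dataset")

print(p)
```

ggplot2 builds the legend automatically from the color mapping, so no separate legend() call is needed.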

Exporting Scatter Plots

Save a scatter plot to a file by opening a graphics device with jpeg(), png(), or pdf(), drawing the plot, and then closing the device with dev.off().

Example: Save as PNG

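# Open a PNG graphics device
png("scatter_plot.png")
# Draw the plot (it goes to the file, not the screen)
plot(x, y, col = "blue", pch = 16, main = "Exported Scatter Plot")
# Close the device to finish writing the file
dev.off()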
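
The device functions also take size arguments. As a sketch, the example below writes a PDF that is 6 by 4 inches. The output path is an assumption: it points to R's temporary directory so the example does not leave files in your working directory, and you would normally use your own file name instead.

```r
# Sample data from earlier in the tutorial
x <- c(1, 2, 3, 4, 5)
y <- c(2, 4, 6, 8, 10)

# Hypothetical output path inside the session's temporary directory
out_file <- file.path(tempdir(), "scatter_plot.pdf")

# For pdf(), width and height are given in inches
pdf(out_file, width = 6, height = 4)
plot(x, y, col = "blue", pch = 16, main = "Exported Scatter Plot")
dev.off()  # close the device so the file is written

# Confirm that the file was created
print(file.exists(out_file))
```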

Tips for Effective Scatter Plots

  1. Use Colors to Highlight Categories: Use different colors for distinct groups or clusters.
  2. Label Axes Clearly: Always label your axes to provide context to the data.
  3. Keep It Simple: Avoid overloading your scatter plot with too many datasets or annotations.
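
For the first tip, one possible approach in base R is to keep a named vector of colors and look up each point's color by its group label. The group labels and the names group, group_cols, and point_cols are hypothetical, used only for this sketch.

```r
# Sample data from earlier in the tutorial, plus a made-up category per point
x     <- c(1, 2, 3, 4, 5)
y     <- c(2, 4, 6, 8, 10)
group <- c("A", "A", "B", "B", "B")

# One color per category, then one color per point via lookup by name
group_cols <- c(A = "blue", B = "red")
point_cols <- unname(group_cols[group])

plot(x, y, col = point_cols, pch = 16,
     main = "Scatter Plot Colored by Group",
     xlab = "X-Axis", ylab = "Y-Axis")

# The legend reuses the same named vector, so it stays in sync with the plot
legend("topleft", legend = names(group_cols), col = group_cols,
       pch = 16, title = "Group")
```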

FAQs About Scatter Plots in R

1. How can I add text annotations to a scatter plot?

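Use the text() function after plot() to annotate specific points. The first two arguments are the x and y coordinates, and pos = 3 places the label just above them so it does not cover the point.

text(3, 6, "Point 3", col = "blue", pos = 3)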

2. Can I create interactive scatter plots?

Yes, packages such as plotly let you create interactive scatter plots with hover tooltips, zooming, and panning.
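
A minimal sketch with plotly, using the same sample data. plotly is a separate package that has to be installed first, so the example checks for it before plotting. The names df and fig are made up for this example.

```r
# Sample data from earlier in the tutorial
df <- data.frame(x = c(1, 2, 3, 4, 5), y = c(2, 4, 6, 8, 10))

# plotly is an add-on package; install it with install.packages("plotly")
if (requireNamespace("plotly", quietly = TRUE)) {
  # The ~ prefix tells plotly to look the column up in the data frame
  fig <- plotly::plot_ly(data = df, x = ~x, y = ~y,
                         type = "scatter", mode = "markers")
  # Show the widget only in an interactive session (RStudio, R console)
  if (interactive()) print(fig)
}
```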

3. How do I scale point sizes based on a variable?

In base R, pass a numeric vector to the cex argument so that each point gets its own size. In ggplot2, map the variable to the size aesthetic inside aes().
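
A sketch of the base R approach, assuming a hypothetical third variable called weight. Raw values rarely make good point sizes, so the example first rescales them to a range of 1 to 3.

```r
# Sample data from earlier in the tutorial, plus a made-up third variable
x      <- c(1, 2, 3, 4, 5)
y      <- c(2, 4, 6, 8, 10)
weight <- c(10, 20, 30, 40, 50)

# Rescale weight linearly so the smallest point has cex 1 and the largest cex 3
point_size <- 1 + 2 * (weight - min(weight)) / (max(weight) - min(weight))

plot(x, y, cex = point_size, pch = 16, col = "blue",
     main = "Point Size Scaled by a Variable",
     xlab = "X-Axis", ylab = "Y-Axis")
```

The ggplot2 counterpart would be aes(x = x, y = y, size = weight) with the variable stored as a column of the data frame. ggplot2 handles the rescaling and adds a size legend on its own.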

Conclusion

Scatter plots are essential tools for visualizing data relationships. With R’s plot() function and the ggplot2 package, you can create both simple and highly customized scatter plots to meet your needs. Start practicing today to unlock new insights from your data!
