ggplot2 vs base R graphics: An example

It took me a while to make the switch over to ggplot2 because I learned how to do things in R mostly with base R.  But I’m glad I finally made the switch.  ggplot2 is so much better in so many ways, and here is an example of how much easier it is to code and how much better the output looks.  (Side note: When I make art in R, I exclusively use base R.)

So anyway, on Tuesday I talked to our incoming graduate students about R, and presented some basics of data input and data visualization.  I gave a simple example of base R vs ggplot2 using a histogram and then a scatter plot.  After the talk, a colleague of mine sent me the following code for an example plot that he had made, and wanted to know how to do it in ggplot2.  Take a look at all this code!

summary(CO2)
str(CO2)
head(CO2, 20)
CO2.QC <- subset(CO2, Type == "Quebec" & Treatment == "chilled")
CO2.QN <- subset(CO2, Type == "Quebec" & Treatment == "nonchilled")
CO2.MC <- subset(CO2, Type == "Mississippi" & Treatment == "chilled")
CO2.MN <- subset(CO2, Type == "Mississippi" & Treatment == "nonchilled")
xrange <- range(CO2$conc)
yrange <- range(CO2$uptake)
png("/Users/gregorymatthews/Dropbox/StatsInTheWild/baseRplot.png", res = 300, units = "in", height = 5, width = 10)
par(mfrow = c(1, 2))
plot(CO2.QC$uptake ~ CO2.QC$conc, pch = 19, lty = 2, lwd = 3, cex = 1.5, las = 1,
ylim = yrange, xlim = xrange, col = "purple", xlab = "Concentration",
ylab = "Uptake")
par(new = TRUE)  # draw the next plot() on top of the current panel
plot(CO2.QN$uptake ~ CO2.QN$conc, pch = 20, lty = 2, lwd = 3, cex = 1.5, las = 1,
ylim = yrange, xlim = xrange, col = "lightblue", main = "Quebec", xlab = "",
ylab = "")
legend("bottomright", title = "Treatment", c("Chilled", "Nonchilled"),
pch = c(19, 20), col = c("purple", "lightblue"))
plot(CO2.MC$uptake ~ CO2.MC$conc, pch = 19, lty = 2, lwd = 3, cex = 1.5, las = 1,
ylim = yrange, xlim = xrange, col = "orange", xlab = "Concentration",
ylab = "Uptake")
par(new = TRUE)  # draw the next plot() on top of the current panel
plot(CO2.MN$uptake ~ CO2.MN$conc, pch = 20, lty = 2, lwd = 3, cex = 1.5,
ylim = yrange, las = 1,
xlim = xrange, col = "red", main = "Mississippi", xlab = "", ylab = "")
legend("topleft", title = "Treatment", c("Chilled", "Nonchilled"), pch = c(19,20),
col = c("orange", "red"))
par(mfrow = c(1, 1), las = 1)
dev.off()

All that code produces the following plot:

[Image: baseRplot.png]

These are fine-looking graphs, but they take a lot of manual work: the x and y limits have to be computed and passed to every plot() call, the legends look weird and have to be added to each panel by hand, and grid lines, which would be nice to have, can only be added manually.  (As a fun exercise, try to recreate the plot above using ggplot2. Don’t cheat!)
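To make the grid-line point concrete: in base R the grid comes from an explicit grid() call, and the points are easiest to layer with points() rather than a second plot(). What follows is only a minimal sketch of a single Quebec panel, assuming the built-in CO2 data and reusing the colors from the code above. The names quebec and cols are made up for illustration and are not part of the original code.

```r
# Sketch: one base R panel with manually added grid lines.
# `quebec` and `cols` are illustrative names, not from the original code.
quebec <- subset(CO2, Type == "Quebec")
cols <- c(chilled = "purple", nonchilled = "lightblue")

# Set up an empty plotting region with the limits taken from the full data set
plot(uptake ~ conc, data = quebec, type = "n", las = 1,
     xlim = range(CO2$conc), ylim = range(CO2$uptake),
     xlab = "Concentration", ylab = "Uptake", main = "Quebec")

# Grid lines have to be requested explicitly
grid()

# Draw the points on top of the grid, colored by treatment
points(quebec$conc, quebec$uptake, pch = 19, cex = 1.5,
       col = cols[as.character(quebec$Treatment)])

# The legend is still a separate, manual step
legend("bottomright", title = "Treatment", legend = names(cols),
       pch = 19, col = cols)
```

Even in this tidier form, the limits, the grid, and the legend are each a separate instruction, and the whole thing would have to be repeated for the Mississippi panel.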

Now take a look at how easy this is to do in ggplot2!


library(ggplot2)
png("/Users/gregorymatthews/Dropbox/StatsInTheWild/ggplot2plot.png", res = 300, units = "in", height = 5, width = 10)
ggplot(CO2, aes(x = conc, y = uptake, color = Treatment)) +
  geom_point() +
  facet_grid(~ Type) +
  xlab("Concentration") + ylab("Uptake") + labs(color = "Trt") +
  scale_color_manual(values = c("purple", "blue"))
dev.off()

The code is so much more concise, it’s easier to read, the x and y limits were chosen automatically, and the output looks so much nicer!
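A note on those automatic limits: by default, facets share their scales, so both panels get limits computed from the whole data set, which is the same thing the base R code did by hand with range(). If you would rather let each panel pick its own y limits, one option is facet_wrap() with scales = "free_y". This is a sketch of that variation, not something from the original plot; p_shared and p_free are names chosen here for illustration.

```r
library(ggplot2)

# Default behavior: both panels share the same x and y limits
p_shared <- ggplot(CO2, aes(x = conc, y = uptake, color = Treatment)) +
  geom_point() +
  facet_grid(~ Type)

# Variation (an assumption, not the original plot): each panel gets its own y limits
p_free <- ggplot(CO2, aes(x = conc, y = uptake, color = Treatment)) +
  geom_point() +
  facet_wrap(~ Type, scales = "free_y")
```

For comparing Quebec against Mississippi, the shared default is usually the one you want, since it keeps the two panels on the same footing.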

 

[Image: ggplot2plot.png]
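One caveat about wrapping a ggplot call in png() and dev.off(): that relies on the plot being printed automatically, which happens at the console but not inside a function, a loop, or a script run with source(). A hedged alternative is to store the plot in an object and write it out with ggsave(). In the sketch below, p and out_file are illustrative names, and the temporary directory stands in for whatever path you would actually use.

```r
library(ggplot2)

# Same plot as above, stored in an object instead of printed directly
p <- ggplot(CO2, aes(x = conc, y = uptake, color = Treatment)) +
  geom_point() +
  facet_grid(~ Type) +
  xlab("Concentration") + ylab("Uptake") + labs(color = "Trt") +
  scale_color_manual(values = c("purple", "blue"))

# Hypothetical output location, used here only so the example is self-contained
out_file <- file.path(tempdir(), "ggplot2plot.png")

# ggsave() opens and closes the graphics device for you
ggsave(out_file, plot = p, width = 10, height = 5, units = "in", dpi = 300)
```

The width, height, and dpi values match the settings passed to png() earlier, so the saved file should come out at the same size and resolution.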

What I’m trying to say is that in almost every situation ggplot2 > base R.

Cheers.

Posted on August 22, 2019.
