R has excellent graphics and plotting capabilities, which can mostly be found in three main sources: base graphics, the lattice package and the ggplot2 package. The latter two are built on the highly flexible grid graphics package, while the base graphics routines adopt a pen-and-paper model for plotting, mostly written in Fortran, which dates back to the early days of S, the precursor to R (for more on this, see the book Software for Data Analysis - Programming with R by John Chambers, which has lots of very useful information).

Base graphics

For a basic introduction, see the "getting started" page here. Base graphics are very flexible and allow a great deal of customisation, with many individual functions available. However, they lack a coherent underlying framework and, for visualizing highly structured data, are outclassed by lattice and ggplot2.

    demo("graphics")   # Demonstration of graphics in R
    ?plot              # Help page for main plot function
    ?par               # Help page for changing graphical parameters
    ?layout            # Help page on plot arrangement
    example("pch")     # Point style examples
    colours()          # List pre-defined named colours
    ?plotmath          # Help page on plotting maths symbols
    demo(plotmath)

Useful functions for adding to an existing plot:

    lines, points, abline, curve, text, rug, legend,
    segments, arrows, polygon
    locator, identify   # For interacting with plots

    x <- 10 + (1:20)/10
    y <- x^2 + rnorm(length(x))          # Add Gaussian random noise
    plot(x, y)
    curve(x^2, add=TRUE, lty=2)          # Add dashed line showing y=x^2
    plot(x, y, type="l", col="blue")     # Plot as blue line (try 'type="o"')
    plot(x, y, type="l", log="xy")       # Plot as line with log X & Y axes
    abline(v=11, lty=3)                  # Add vertical dotted line
    text(11.5, 120, "Hello")             # Add annotation
    legend("topleft", inset=0.05, "data", pch=1, col="blue", bty="n")   # Add a legend

    plot(x, y, pch=2, col="red")          # Hollow triangles
    plot(1:10, rep(1, 10), pch=LETTERS)   # Can also use any character
    example("pch")                        # Show point style examples

R uses recycling of vectors in this situation to determine the attributes for each point, i.e. if the vector is shorter than the number of points, it is repeated until it matches the number required.

    plot(x, y, pch=2, col="red")                        # Hollow triangles
    plot(x, y, pch=c(3, 20), col=c("red", "blue"))      # Blue dots; red "+" signs
    plot(x, y, pch=1:20)                                # Different symbol for each point
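The recycling rule can be checked directly before plotting. The sketch below uses `rep_len()` to spell out the vectors that `plot` would effectively use for 20 points; the names `pch.full` and `col.full` are illustrative and not part of the original example.

```r
x <- 10 + (1:20)/10
y <- x^2

# Two styles supplied for 20 points: each vector is repeated to length 20
pch.full <- rep_len(c(3, 20), length(x))
col.full <- rep_len(c("red", "blue"), length(x))

# Odd-numbered points get pch=3 ("+") in red, even-numbered get pch=20 (dot) in blue
head(data.frame(pch=pch.full, col=col.full))

# Passing the expanded vectors should give the same plot as relying on recycling
plot(x, y, pch=pch.full, col=col.full)
```

Writing the expanded vectors out like this is mainly useful for debugging, for instance when the number of points is not a multiple of the number of styles.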

For example, colours can be drawn from the `rainbow` palette:

    col <- rainbow(length(x))
    plot(x, y, col=col)

    plot(x, y, xlab="Some data", ylab="Wibble")

Axis limits are set with `xlim` and `ylim`, each a vector of the minimum and maximum values for that axis.

    plot(x, y, xlim=c(11, 12), ylim=c(0, 150))
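If you would rather derive the limits from the data than type them in, `range()` gives the minimum and maximum, and `extendrange()` (from grDevices) pads that range by a fraction. A minimal sketch, with the 10% padding chosen arbitrarily for illustration:

```r
x <- 10 + (1:20)/10
y <- x^2

xlim <- range(x)                # exact data limits: c(min, max)
ylim <- extendrange(y, f=0.1)   # data limits widened by 10% of the range at each end

plot(x, y, xlim=xlim, ylim=ylim)
```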

# Changing the plot layout

The basic idea behind the R function `layout` is to divide the plotting device into a series of rows and columns specified by a matrix. The matrix itself is composed of values referring to the plot number, generally just 1, 2, 3, etc., but can feature repetition.

    matrix(1:2)
    matrix(1:4)                     # 4x1
    matrix(1:4, 2, 2)               # 2x2
    matrix(1:6, 3, 2)               # 3x2 ordered by columns
    matrix(1:6, 3, 2, byrow=TRUE)   # 3x2 ordered by rows

    layout(matrix(1:4, 2, 2))   # Specify layout for 4 panels
    layout.show(4)              # Show all 4 panels of the defined layout
    layout.show(2)              # Try specifying just 2 instead

    x <- 1:10
    plot(x, x)
    plot(x, x^2)
    plot(x, sqrt(x))
    plot(x, log10(x))
    curve(log10, add=TRUE)   # Adds to last panel plotted
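Repeating a plot number in the layout matrix makes that plot span several cells. The sketch below (the particular arrangement is just an illustration) puts one wide panel across the top row and two panels underneath:

```r
# Plot 1 occupies both cells of the top row; plots 2 and 3 share the bottom row
m <- matrix(c(1, 1,
              2, 3), nrow=2, ncol=2, byrow=TRUE)
n <- layout(m)    # layout() returns the number of figures, here 3
layout.show(n)

x <- 1:10
plot(x, x^2, type="l")   # wide top panel
plot(x, sqrt(x))         # bottom left
plot(x, log10(x))        # bottom right
```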

The `heights` and `widths` arguments to `layout` are vectors of relative heights and widths of the matrix rows and columns, respectively.

    layout(matrix(1:4, 2, 2), heights=c(2, 1)); layout.show(4)
    replicate(4, plot(x, x))   # Repeat plot 4 times
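`widths` works the same way for columns, and the two arguments can be combined. A small sketch with arbitrary proportions (the margin setting is an extra assumption, added so that the narrow panels still have room to draw):

```r
# Top row twice as tall as the bottom; left column three times as wide as the right
n <- layout(matrix(1:4, 2, 2), heights=c(2, 1), widths=c(3, 1))
layout.show(n)

par(mar=c(4, 4, 2, 1))   # slimmer margins than the default
x <- 1:10
for (i in 1:n) plot(x, x, main=paste("Panel", i))
```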

# Plotting a function or equation

The function `curve` allows you to plot equations or complex functions, either on their own, or added to an existing plot (with `add=TRUE`).

    curve(x^2)
    curve(x^1)                               # "curve(x)" fails! (can also use "curve(I(x))")
    curve(x^2+log10(x)-sin(x))               # Can use arithmetic
    curve(dnorm)                             # Normal distribution for mean=0, standard deviation=1
    curve(x^3-x+1, from=-10, to=10, lty=2)   # Specify range & use dashed line

    curve(dnorm(x, mean=1, sd=2), from=-10, to=10)   # Pass extra arguments to the function

`curve` provides the function to be plotted with a vector of x-axis values called `x` with which to calculate the corresponding y-axis data. If the argument of your function is not called `x` (e.g. `r`), then you need to use the following syntax: `curve(myfun(r=x))`. The following example illustrates this with a plot of several blackbody curves.

    blackbody <- function(lambda, Temp=1e3) {
      h <- 6.626068e-34; c <- 3e8; kb <- 1.3806503e-23   # constants (SI units)
      lambda <- lambda * 1e-6                            # Convert from microns to metres
      (2*pi*h*c^2) / (lambda^5 * (exp((h*c)/(lambda*kb*Temp)) - 1))
    }

    main <- "Planck blackbody curves"
    xlab <- expression(paste(Wavelength, " (", mu, "m)"))
    ylab <- expression(paste(Intensity, " ", (W/m^3)))
    col <- c("blue", "orange", "red")
    lty <- 1:3
    curve(blackbody(lambda=x), from=1, to=15, main=main, xlab=xlab, ylab=ylab, col=col[1])

    curve(blackbody(lambda=x, Temp=900), add=TRUE, col=col[2], lty=lty[2])
    curve(blackbody(lambda=x, Temp=800), add=TRUE, col=col[3], lty=lty[3])
    legtext <- paste(c(1000, 900, 800), "K", sep="")
    legend("topright", inset=0.05, legend=legtext, lty=lty, col=col, text.col=col)

    dev.copy2pdf(file="blackbody.pdf")   # Also "dev.copy2eps"
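The same pattern applies to any function whose argument is not called `x`. In the sketch below, `area` is a made-up function of a radius `r`, used only for illustration; `curve` also returns the points it evaluated (invisibly), which is handy for checking the result.

```r
# Hypothetical function whose argument is named r, not x
area <- function(r) pi * r^2

# Map curve's internal vector x onto the argument r
pts <- curve(area(r=x), from=0, to=5, xlab="Radius", ylab="Area")

str(pts)   # list with components x and y (101 points by default)
```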

# Interacting with the plot

Type `locator()`, then left click with the mouse any number of times within the axes and right click to end; the R prompt will then return and a list will be printed with the X and Y coordinates of the positions clicked. You can retain this information by repeating the above, but with `A <- locator()`; the coordinates will then be stored in `A$x` and `A$y`.

The `identify` function works in a similar way to `locator`, but labels the data point nearest to each click with its element number:

    x <- 1:10; y <- x^2
    plot(x, y)
    identify(x, y)

This is more useful if you have named points, in which case `identify` can print the name instead of the element number, for example:

    names(x) <- LETTERS[1:length(x)]
    plot(x, y)
    identify(x, y, labels=names(x))   # don't forget right click to finish!

Lattice graphics

Lattice is an excellent package for visualizing multivariate data, which is essentially a port of the S software trellis display to R. While it lacks the flexibility and extensibility of ggplot2, it nevertheless represents a great set of routines for quickly displaying complex data with ease. This makes it ideal for use in exploratory data analysis; you can find out more by reading the excellent book Lattice - Multivariate Data Visualization with R by Deepayan Sarkar.

- Some examples of using lattice: first assemble some data (from this book) on the masses (in kg) and semi-major axis lengths (in metres) of the planets, and make a dotplot of the former:

      planets.mass <- c("Mercury"=0.33, "Venus"=4.87, "Earth"=5.98, "Mars"=0.64,
                        "Jupiter"=1899, "Saturn"=569, "Uranus"=87, "Neptune"=102,
                        "Pluto"=0.13) * 1e24
      planets.semimajoraxis <- c("Mercury"=57.9, "Venus"=108, "Earth"=150, "Mars"=228,
                                 "Jupiter"=778, "Saturn"=1430, "Uranus"=2870,
                                 "Neptune"=4500, "Pluto"=5900) * 1e9
      library(lattice)   # ensure package is loaded
      dotplot(sort(log10(planets.mass)), xlab="log10 mass (kg)")

  A histogram and a kernel-smoothed density plot of the semi-major axes:

      histogram(log10(planets.semimajoraxis))
      densityplot(log10(planets.semimajoraxis))   # shows raw data as "jittered" points

  Now to demonstrate the multivariate capabilities, assemble the data in a data frame and create a categorical variable `giant`, which identifies the 4 most massive planets:

      A <- data.frame(sma=planets.semimajoraxis, mass=planets.mass)
      A$name <- rownames(A)
      A$giant <- ifelse(A$mass>1e25, "Giant", "Not giant")

  Lattice can now handle the different categories separately, either by using `groups`, to use different plotting symbols etc. within the same panel, e.g.:

      dotplot(reorder(name, sma) ~ log10(sma), data=A,
              xlab="log10 semi-major axis (m)", groups=giant, auto.key=TRUE)

  ...or by conditioning on a categorical variable, to plot separate panels for each dataset:

      dotplot(reorder(name, sma) ~ log10(sma) | giant, data=A,
              xlab="log10 semi-major axis (m)", auto.key=TRUE)

- You can also easily plot linear regression models (from `lm`) for each group category, using the `type` argument:

      xyplot(sma ~ mass, data=A, groups=giant, scales=list(log=TRUE),
             type=c("g", "p", "r"), auto.key=list(lines=TRUE))
      #
      #---Other "type" arguments:
      #  "g" = show gridlines
      #  "p" = points
      #  "l" = lines (join the dots)
      #  "r" = linear regression model
      #  "smooth" = locally-weighted regression using "loess"
      #
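To see the numbers behind the `"r"` lines, the per-group fits can be reproduced with `lm`. This is a sketch under the assumption that, with `scales=list(log=TRUE)`, lattice fits the regression to the log10-transformed values; the data frame is rebuilt here so the block runs on its own, and `fits` is just an illustrative name.

```r
planets.mass <- c(Mercury=0.33, Venus=4.87, Earth=5.98, Mars=0.64, Jupiter=1899,
                  Saturn=569, Uranus=87, Neptune=102, Pluto=0.13) * 1e24
planets.semimajoraxis <- c(Mercury=57.9, Venus=108, Earth=150, Mars=228, Jupiter=778,
                           Saturn=1430, Uranus=2870, Neptune=4500, Pluto=5900) * 1e9
A <- data.frame(sma=planets.semimajoraxis, mass=planets.mass)
A$name <- rownames(A)
A$giant <- ifelse(A$mass > 1e25, "Giant", "Not giant")

table(A$giant)   # number of planets in each category

# One straight-line fit per category, in log-log space (assumed to match type "r")
fits <- lapply(split(A, A$giant),
               function(d) coef(lm(log10(sma) ~ log10(mass), data=d)))
fits             # intercept and slope for each group
```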

- Lattice offers a very quick route to visualize a set of properties conditioned on one or more factors. For example, to show boxplots of 4 different quantities in separate panels, with each panel comparing values in different categories:

      file <- "http://www.sr.bham.ac.uk/~ajrs/papers/sanderson09/sanderson09_table2.txt"
      a <- read.table(file, header=TRUE, sep="|")
      #--This plot is actually saved as an R object "p" (for use below) and with the
      #  outer "(" & ")" the result is also printed (i.e. plotted in this case,
      #  since "printing" a lattice object draws the plot):
      ( p <- bwplot(z + kT + Z + index ~ cctype, data=a, outer=TRUE, scales="free", ylab="") )
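The table in that example is read from a URL, so here is the same idea on R's built-in `iris` data, an assumption made purely so the sketch runs offline: four measurements on the left of the formula, one factor on the right, and `outer=TRUE` to give each measurement its own panel.

```r
library(lattice)

# Four responses joined with "+"; outer=TRUE puts each in a separate panel,
# and scales="free" lets every panel have its own axis range
p <- bwplot(Sepal.Length + Sepal.Width + Petal.Length + Petal.Width ~ Species,
            data=iris, outer=TRUE, scales="free", ylab="")
print(p)   # printing a lattice object draws the plot
```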

- Another excellent feature of lattice is the ability to span plots over multiple pages, using the `layout` argument (which is a vector specifying the required number of columns, rows & pages for the plot panels). This is great if you are plotting a large number of panels and want to dump them onto separate pages of a PDF document, say. Following on from the previous example (saved as the lattice object `p`):

      devAskNewPage(TRUE)            # force prompt between each page
      update(p, layout=c(2, 1, 2))   # 2 cols; 1 row; 2 pages
      devAskNewPage(FALSE)           # restore default
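To send the pages straight to a PDF instead of stepping through them on screen, one option is to open a `pdf` device and print the updated object. A sketch using the built-in `iris` data and a temporary file name (both assumptions for illustration, not the author's data):

```r
library(lattice)

p <- bwplot(Sepal.Length + Sepal.Width + Petal.Length + Petal.Width ~ Species,
            data=iris, outer=TRUE, scales="free", ylab="")

outfile <- tempfile(fileext=".pdf")    # hypothetical output location
pdf(outfile)                           # one file, multiple pages by default
print(update(p, layout=c(2, 1, 2)))    # 2 cols; 1 row; 2 pages
dev.off()

file.exists(outfile)
```

Inside a script or function the explicit `print()` matters, because lattice objects are only drawn automatically when they are evaluated at the top-level prompt.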

- You can see examples of a timeseries and dotplot created with lattice, together with the R code that produced them in the R gallery page.

ggplot2 graphics

The `gg` in `ggplot2` refers to the book The Grammar of Graphics (which I can highly recommend), by Leland Wilkinson, which has been implemented in an R package by Hadley Wickham. This is another excellent package for multivariate data analysis in R, which is based on a grammatical approach to graphics that provides great flexibility in design. The package is still under active development, and its only noticeable (and slight) drawback is the small delay in rendering the final plot. This reflects the fact that there's a lot going on behind the scenes in order to produce such highly polished graphics. The package has an official website and wiki and you can find out more by reading the excellent book ggplot2 - Elegant Graphics for Data Analysis by Hadley Wickham.

    # install.packages("ggplot2")   # package not one of the default R libraries
    library(ggplot2)                # ensure package is loaded
    # Uses the data frame A assembled in the lattice section
    ggplot(data=A, aes(x=mass, y=sma, label=name, colour=giant, shape=giant)) +
      geom_point() +
      scale_x_log10(limits=c(1e23, 1e28)) +
      scale_y_log10() +
      geom_text(hjust=-0.2, show.legend=FALSE) +
      labs(x="Planet mass (kg)", y="Semi-major axis of planet orbit (m)",
           colour="Type", shape="Type")

For further information, you can find out more about how to access, manipulate, summarise, plot and analyse data using R.

Also, why not check out some of the graphs and plots shown in the R gallery, with the accompanying R source code used to create them.