Plotting galaxy positions and redshifts
-
The dataset is a list of objects around the galaxy cluster
Abell 85
, obtained with an extended NED (NASA Extragalactic Database) search with bar (pipe) separated ASCII output.url <- "http://www.sr.bham.ac.uk/~ajrs" file <- "R/datasets/a85_extended_NEDsearch.txt" A <- read.table(paste(url, file, sep="/"), sep="|", skip=20, header=TRUE) close(url(paste(url, file, sep="/"))) # close connection after use dim(A) # Show dimensions of data frame: rows columns
-
The default column names are a bit of a mouthful, so let's rename some of the ones to be used (object name, right ascension, declination & type):
colnames(A)[c(2, 3, 4, 5)] <- c("name", "ra", "dec", "type") - First, plot a bar chart showing the numbers of each type of object:
plot(A$type, log="y") # Log the Y axis, as there are many galaxies table(A$type) # table summary of the same information barplot(sort(table(A$type), decreasing=TRUE), log="y") # ordered bar chart
-
You can show boxplots of the redshifts of each object type as follows:
plot(Redshift ~ type, data=A, log="y") abline(h=0.055, col="red") # Show redshift of the cluster Abell 85
-
You can mark the redshifts of all objects identified as galaxy clusters with a
rug
of dashes on the right axis:
You can see there are a few high redshift quasars, but also that the galaxies form a very broad distribution, indicating that there is more than 1 galaxy cluster present in the dataset.rug(A$Redshift[A$type=="GClstr"], col="blue", side=4)
-
You can list the objects classified as galaxy clusters by NED:
A[A$type=="GClstr", ] levels(A$type) # list all types of object in the dataset
-
Now plot the positions of the objects, using "." as a small point marker, as there are many objects, then zoom in on the interesting region, by excluding the spatial outlier galaxies.
- make sure to leave space between "<" and "-" with negative numbers!plot(dec ~ ra, data=A, pch=".") plot(dec ~ ra, data=A, pch=".", xlim=c(10, 11), ylim=c(-10, -9)) A <- subset(A, ra > 9.5 & ra < 11.5 & dec > -10.3 & dec < -8.5) plot(dec ~ ra, data=A, pch=".")
-
Now, let's focus on the galaxies only:
G <- subset(A, type=="G")
-
and further limit ourselves to those galaxies with measured redshifts below 0.3 (when a value is missing in the input table, R assigns it the special value of
NA
; see?NA
for more information)G <- subset(G, !is.na(Redshift) & Redshift < 0.2) G[0] # shows no. of rows in data frame
-
A histogram of the measured galaxy redshifts shows the main Abell 85 cluster peak around 0.05 and a long tail up to z~0.25, which can be more clearly seen in a smoothed density plot.
hist(G$Redshift) plot(density(G$Redshift)) rug(G$Redshift) # plot raw values as tick marks above X-axis
-
Now let's create a new column in the data frame, containing a colour for each galaxy, based on a simple redshift cut (red for higher redshift; blue for lower):
There's some indication of an overdensity of more distant galaxies in the bottom right, but it's hard to tell with only a crude redshift cut. A nicer result can be achieved by expressing the redshift as a continously-varying colour.
G$cols <- as.character(ifelse(G$Redshift > 0.1, "red", "blue")) head(G$cols) # Show the first few colours in the column colours() # Show all pre-defined colours ("colors()" is a valid alias) plot(dec ~ ra, data=G, col=cols) # "cols" is retrieved from the data frame "G" -
To do this, you'll need to define the following functions:
remap <- function(x) ( x - min(x) ) / max( x - min(x) ) # map x onto [0, 1] fun.col <- function(x) rgb(colorRamp(c("blue", "red"))(remap(x)), maxColorValue = 255) G$cols <- with(G, fun.col(Redshift) ) head(G$cols) # colours defined by hexadecimal code, rather than predefined names plot(dec ~ ra, data=G, col=cols) -
You can also overlay a contour plot of a kernel-smoothed density estimate of the galaxy distribution, using
bkde2D
from the packageKernSmooth
.
This should produce a plot a bit like this one from the R plot gallery.require(KernSmooth) # ensure that the package is loaded #--Calculate 2-dimensional kernel-smoothed estimate: est <- bkde2D(G[c("ra", "dec")], bandwidth=c(0.07, 0.07), gridsize=c(101, 101)) #--Display as a contour map: with(est, contour(x1, x2, fhat, drawlabels=FALSE, add=TRUE)) -
Another way of displaying the velocity and position information is with a 3d
scatter plot:
It can be seen that there is a definite extended structure of galaxies lying behind the main clump in the cluster (i.e. at larger velocity).library(lattice) # Load the necessary library cloud(Velocity ~ ra + dec, data=G, cex=0.2) # small point size
Plotting some longitudinal data (galaxy cluster gas profiles)
- The dataset consists of measurements of gas entropy at a series of radii for 20 different galaxy clusters, taken from this paper. The data have already been smoothed, using the R
loess
function.url <- "http://www.sr.bham.ac.uk/~ajrs" file <- "R/datasets/cluster_entropy_profiles.txt" A <- read.table(paste(url, file, sep="/"), header=TRUE) close(url(paste(url, file, sep="/"))) # close connection after use dim(A) # Show dimensions of data frame: rows columns
- There are differing numbers of measurements for each cluster, according to data quality. You can see how many points there are for each cluster and whether the cluster has a cool core or not, as follows:
xtabs( ~ cname + cctype, data=A) # simply count number of entries for each cluster
- Now, there are several ways of plotting these data, but, given their mulitvariate nature (entropy vs. radius for each of 20 clusters), the R lattice package is ideally suited to the task, using the
xyplot
function.This treats the points as if they were all one data series, but you can use therequire(lattice) # load the package xyplot(egas.smo ~ r.kpc, data=A) # plot smoothed entropy vs. radius, in kiloparsecs xyplot(egas.smo ~ r.kpc, data=A, type="l") # plot as lines, rather than points
groups
argument to identify the separate datasets:Lattice makes it easy to add a legend, to help identify each different line, although this works best with 7 or fewer lines:xyplot(egas.smo ~ r.kpc, data=A, groups=cname, type=c("g", "l")) # plot gridlines as wellxyplot(egas.smo ~ r.kpc, data=A, groups=cname, type=c("g", "l"), auto.key=list(points=FALSE, lines=TRUE, columns=4))- Lattice also enables conditioning of the data, meaning plotting subsets defined according to some other variable or factor. For example, to see the difference between cool core and non-cool core entropy profiles:
note that both axes have been logged as well. Note how the multiple graphs are all labelled according to the values (in this case factor levels) and that the axes are all scaled identically, to enable direct comparison, with shared axes. To scale the axes independently on each sub-plot, try this:xyplot(egas.smo ~ r.kpc | cctype, data=A, groups=cname, type=c("g", "l"), auto.key=list(points=FALSE, lines=TRUE, columns=4), scales=list(log=TRUE))xyplot(egas.smo ~ r.kpc | cctype, data=A, groups=cname, type=c("g", "l"), auto.key=list(points=FALSE, lines=TRUE, columns=4), scales=list(log=TRUE, relation="free")) # Use this: 'scales=list(x=list(log=TRUE))' to log only the x axis- To inspect the profiles for individual cases, you can simply condition on
cname
:By default the subplots share common axes, but you can use separate axes for each subplot:xyplot(egas.smo ~ r.kpc | cname, data=A, groups=cctype, type=c("g", "l"), scales=list(log=TRUE), auto.key=list(points=FALSE, lines=TRUE))To control the layout of the subplots, you can supply a vector of column, row and page numbers:xyplot(egas.smo ~ r.kpc | cname, data=A, groups=cctype, type=c("g", "l"), scales=list(log=TRUE, relation="free"), auto.key=list(points=FALSE, lines=TRUE)) # Use 'x=list(relation="free")' to have only 1 axis freely scaledthe ability to span multiple pages is extremely useful, and (as far as I know) is something that ggplot2 cannot easily do. Now, if you want to arrange the sub-plots in a particular order, you can use thexyplot(egas.smo ~ r.kpc | cname, data=A, groups=cctype, type=c("g", "l"), auto.key=list(points=FALSE, lines=TRUE), scales=list(log=TRUE), layout=c(2, 5, 2)) # 2 cols, 5 rows, 2 pagesreorder
function:(Note that you can refer to the name of any column of# "cname" must be a factor for the following, which is the default data type created for # non-numeric data read in with "read.table"; you can coerce to a factor type easily using: # A$cname <- factor(A$cname) levels(A$cname) # default ordering levels(reorder(A$cname, A$clus.kT)) # sort by ascending mean cluster temperature xyplot(egas.smo ~ r.kpc | reorder(cname, clus.kT), data=A, groups=cctype, type=c("g", "l"), scales=list(log=TRUE), auto.key=list(points=FALSE, lines=TRUE))A
within the plotting function.)
By default the ordering of subplots is from the bottom left to the top right, by row, but you can change this to order from top left to bottom right (by row) using the argumentas.table=TRUE
- Alternatively you can dynamically create a new factor for conditioning, by binning up a continuous variable, using
equal.count
to createshingles
, named after (overlapping) roof tiles. This creates a specified number of bins containing (as near as) equal numbers in each bin. This be seen as follows:These can be used to group the profiles according to their mean temperature:plot(equal.count(A$clus.kT)) # split mean temperature into 6 bins plot(equal.count(A$clus.kT, number=4, overlap=0)) # 4 bins, no overlap
In this case, thexyplot(egas.smo ~ r.kpc | equal.count(clus.kT), data=A, type="l", group=cctype, scale=list(log=TRUE), auto.key=list(lines=TRUE, points=FALSE))strips
at the top of each subplot are partially shaded to show the fraction of the full range spanned (i.e. the bin widths in clus.kT here).- Finally, you can create nicer looking labels for the logged axes using the following functions, provided in the excellent book
Lattice: Multivariate Data Visualization with R
, by Deepayan Sarkar:source("http://www.sr.bham.ac.uk/~ajrs/R/scripts/lattice_functions.R") xyplot(egas.smo ~ r.kpc | equal.count(clus.kT), data=A, type="l", group=cctype, scale=list(log=TRUE), auto.key=list(lines=TRUE, points=FALSE), xscale.components=xscale.components.log10, yscale.components=yscale.components.log10)
For further information, you can find out more about how to access, manipulate, summarise, plot and analyse data using R.
- There are differing numbers of measurements for each cluster, according to data quality. You can see how many points there are for each cluster and whether the cluster has a cool core or not, as follows: