---
title: "Getting Graphic"
subtitle: "Base graphics and `ggplot2`"
---

```{r, echo=FALSE, message=FALSE, results='hide', purl=FALSE}
source("knitr_header.R")
```



[The R Script associated with this page is available here](`r output`). Download this file and open it (or copy-paste into a new script) with RStudio so you can follow along.

## Data

In this module, we'll primarily use the `mtcars` data object. The data were extracted from the 1974 *Motor Trend* US magazine and comprise fuel consumption and 10 aspects of automobile design and performance for 32 automobiles (1973–74 models). It is a data frame with 32 observations of 11 variables.

| Column name | Description                              |
|:------------|:-----------------------------------------|
| mpg         | Miles/(US) gallon                        |
| cyl         | Number of cylinders                      |
| disp        | Displacement (cu.in.)                    |
| hp          | Gross horsepower                         |
| drat        | Rear axle ratio                          |
| wt          | Weight (1000 lb)                         |
| qsec        | 1/4 mile time                            |
| vs          | Engine shape (0 = V-shaped, 1 = straight) |
| am          | Transmission (0 = automatic, 1 = manual) |
| gear        | Number of forward gears                  |
| carb        | Number of carburetors                    |

Here's what the data look like:

```{r,warning=F}
library(ggplot2)
library(knitr)
kable(head(mtcars))
```

# Base graphics

## Base `plot()`

R has a set of 'base graphics' that can do many plotting tasks (scatterplots, line plots, histograms, etc.)

```{r}
plot(y=mtcars$mpg,x=mtcars$wt)
```

Or you can use the more common *formula* notation:

```{r}
plot(mpg~wt,data=mtcars)
```

And you can customize with various parameters:

```{r}
plot(mpg~wt,data=mtcars,
  ylab="Miles per gallon (mpg)",
  xlab="Weight (1000 pounds)",
  main="Fuel Efficiency vs. Weight",
  col="red"
  )
```

Or switch to a line plot:

```{r}
plot(mpg~wt,data=mtcars,
  type="l",
  ylab="Miles per gallon (mpg)",
  xlab="Weight (1000 pounds)",
  main="Fuel Efficiency vs. Weight",
  col="blue"
  )
```

See `?plot` for details.

## Histograms

Check out the help for basic histograms.

```{r,results='hide'}
?hist
```

Plot a histogram of the fuel efficiencies in the `mtcars` dataset.

```{r}
hist(mtcars$mpg)
```

# [`ggplot2`](http://ggplot2.org)

The _grammar of graphics_: consistent aesthetics, multidimensional conditioning, and step-by-step plot building.

1. Data: The raw data
2. `geom_`: The geometric shapes representing data
3. `aes()`: Aesthetics of the geometric and statistical objects (color, size, shape, and position)
4. `scale_`: Maps between the data and the aesthetic dimensions

```
data
+ geometry
+ aesthetic mappings like position, color and size
+ scaling of ranges of the data to ranges of the aesthetics
```

### Additional settings

5. `stat_`: Statistical summaries of the data that can be plotted, such as quantiles, fitted curves (loess, linear models), etc.
6. `coord_`: Transformation for mapping data coordinates into the plane of the data rectangle
7. `facet_`: Arrangement of data into a grid of plots
8. `theme`: Visual defaults (background, grids, axes, typeface, colors, etc.)

[Figure: a simple scatterplot]

[Figure: the same scatterplot with variable colors and sizes]

## Simple scatterplot

First, create a *blank* ggplot object with the data and the x and y aesthetics set up.

```{r}
p <- ggplot(mtcars, aes(x=wt, y=mpg))
summary(p)
p
```

Then add a layer of points:

```{r}
p + geom_point()
```

Or you can do both at the same time:

```{r}
ggplot(mtcars, aes(x=wt, y=mpg)) +
  geom_point()
```

### Aesthetic map: color by # of cylinders

```{r}
p +
  geom_point(aes(colour = factor(cyl)))
```

### Set shape using # of cylinders

```{r}
p +
  geom_point(aes(shape = factor(cyl)))
```

### Adjust size by `qsec`

```{r}
p +
  geom_point(aes(size = qsec))
```

### Color by cylinders and size by `qsec`

```{r}
p +
  geom_point(aes(colour = factor(cyl),size = qsec))
```

### Multiple aesthetics

```{r,fig.height=4}
p +
  geom_point(aes(colour = factor(cyl),size = qsec,shape=factor(gear)))
```

### Add a linear model

```{r}
p + geom_point() +
  geom_smooth(method="lm")
```

### Add a LOESS smooth

```{r}
p + geom_point() +
  geom_smooth(method="loess")
```

### Change scale color

```{r}
p + geom_point(aes(colour = cyl)) +
  scale_colour_gradient(low = "blue")
```

### Change scale shapes

```{r}
p + geom_point(aes(shape = factor(cyl))) +
  scale_shape(solid = FALSE)
```

### Set aesthetics to fixed value
```{r}
ggplot(mtcars, aes(wt, mpg)) +
  geom_point(colour = "red", size = 3)
```

### Transparency: alpha=0.2

```{r}
d <- ggplot(diamonds, aes(carat, price))
d + geom_point(alpha = 0.2)
```

Varying alpha is useful for large data sets.

### Transparency: alpha=0.1

```{r}
d + geom_point(alpha = 0.1)
```

### Transparency: alpha=0.01

```{r}
d + geom_point(alpha = 0.01)
```

## Building ggplots

## Other Plot types
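The figure for this section is not reproduced here. As a rough stand-in, the sketch below draws three other common plot types from `mtcars`: a histogram, a density curve, and a box plot. The object names `h`, `dens`, and `bx` and the choice of `bins = 10` are illustrative assumptions, not part of the original slides.

```r
library(ggplot2)

# One continuous variable: histogram of fuel efficiency
h <- ggplot(mtcars, aes(x = mpg)) +
  geom_histogram(bins = 10)

# One continuous variable: smoothed density estimate
dens <- ggplot(mtcars, aes(x = mpg)) +
  geom_density()

# Discrete x, continuous y: box plot of mpg by number of cylinders
bx <- ggplot(mtcars, aes(x = factor(cyl), y = mpg)) +
  geom_boxplot()

h
dens
bx
```

Each plot reuses the same pattern as the scatterplots above: data, an aesthetic mapping, and one geom layer.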
Edit plot `p` to include:

1. Points
2. A smooth ('loess') curve
3. A "rug"

```
p <- ggplot(mtcars, aes(x=wt, y=mpg))
```
```{r,message=F, purl=F}
p +
  geom_point() +
  geom_smooth(method = "loess") +  # request the loess smoother explicitly
  geom_rug()
```
### Discrete X, Continuous Y

```{r}
p <- ggplot(mtcars, aes(factor(cyl), mpg))
p + geom_point()
```

### Discrete X, Continuous Y + geom_jitter()

```{r}
p +
  geom_jitter()
```

### Discrete X, Continuous Y + geom_violin()

```{r}
p +
  geom_violin()
```

### Discrete X, Continuous Y + geom_violin() + geom_jitter()

```{r}
p +
  geom_violin() +
  geom_jitter(position = position_jitter(width = .1))
```

### Three Variables

Will return to this when we start working with raster maps.

### Stats

Visualize a data transformation.

* Each stat creates additional variables with a common ``..name..`` syntax
* There are often two ways to draw the same layer: `stat_bin(geom="bar")` OR `geom_bar(stat="bin")`

### 2D kernel density estimation

Old Faithful geyser data on eruption duration and waiting times.

```{r}
library("MASS")
data(geyser)
m <- ggplot(geyser, aes(x = duration, y = waiting))
```

[photo: Greg Willis](https://commons.wikimedia.org/wiki/File:Old_Faithful_(3679482556).jpg)

See `?geyser` for details.

```{r}
m +
  geom_point()
```

```{r}
m +
  geom_point() +
  stat_density2d(geom="contour")
```

Check `?geom_density2d` for details.

Update the limits to show the full contours:

```{r}
m +
  geom_point() +
  stat_density2d(geom="contour") +
  xlim(0.5, 6) + ylim(40, 110)
```

Fill the contours by density level:

```{r}
m + stat_density2d(aes(fill = ..level..), geom="polygon") +
  geom_point(col="red")
```
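As a small illustration of the computed variables mentioned under Stats, the sketch below maps the `density` variable computed by `stat_bin` to the y axis of a histogram of waiting times. It assumes ggplot2 3.3.0 or later, where `after_stat(density)` is the newer spelling of `..density..`. The object name `hd` and `bins = 20` are illustrative choices.

```r
library(ggplot2)
library(MASS)  # provides the geyser data

# stat_bin computes count and density for every bin;
# after_stat(density) plots the density instead of the default count
hd <- ggplot(geyser, aes(x = waiting)) +
  geom_histogram(aes(y = after_stat(density)), bins = 20)
hd
```

With density on the y axis, the bar areas sum to one, which makes histograms of different sample sizes easier to compare.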
## Your turn

Edit plot `m` to include:

* The point data (with red points) on top
* A `binhex` plot of the Old Faithful data

Experiment with the number of bins to find one that works.

See `?stat_binhex` for details.

```
#install.packages("hexbin")
library(hexbin)
m <- ggplot(geyser, aes(x = duration, y = waiting))
```
```{r, purl=F}
m + stat_binhex(bins=10) +
  geom_point(col="red")
```
## Specifying Scales

### Discrete color: default

```{r}
b <- ggplot(mpg, aes(fl)) +
  geom_bar(aes(fill = fl))
b
```

### Discrete color: greys

```{r}
b + scale_fill_grey(start = 0.2, end = 0.8, na.value = "red")
```

### Continuous color: defaults

```{r, message=F}
a <- ggplot(mpg, aes(x=hwy,y=cty,col=displ)) +
  geom_point()
a
```

### Continuous color: `gradient`

```{r, message=F}
a + scale_color_gradient(low = "red", high = "yellow")
```

### Continuous color: `gradient2`

```{r, message=F}
a + scale_color_gradient2(low = "red", high = "blue",
                          mid = "white", midpoint = 4)
```

### Continuous color: `gradientn`

```{r, message=F}
a + scale_color_gradientn(colours = rainbow(10))
```

### Discrete color: brewer

```{r}
b +
  scale_fill_brewer(palette = "Blues")
```

## [colorbrewer2.org](http://colorbrewer2.org)

## ColorBrewer: Diverging

## ColorBrewer: Filtered
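To see which colors a ColorBrewer palette contains before passing its name to `scale_fill_brewer()`, one option is the RColorBrewer package, which supplies these palettes. The sketch assumes RColorBrewer is installed. The palette names `"Blues"` and `"RdBu"` and the variable names are illustrative choices.

```r
library(RColorBrewer)

# Five colors from a sequential palette (light to dark blue)
seq_cols <- brewer.pal(5, "Blues")

# Five colors from a diverging palette (red through near-white to blue)
div_cols <- brewer.pal(5, "RdBu")

seq_cols
div_cols

# Show every available palette in one figure
display.brewer.all()
```

The same palette names can then be used in `scale_fill_brewer(palette = ...)` or `scale_colour_brewer(palette = ...)`.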
## Your turn

Edit the contour plot of the geyser data:

1. Reduce the size of the points
2. Use a sequential brewer palette (select from [colorbrewer2.org](http://colorbrewer2.org))
3. Add informative x and y labels

```
m <- ggplot(geyser, aes(x = duration, y = waiting)) +
  stat_density2d(aes(fill = ..level..), geom="polygon") +
  geom_point(col="red")
```

Note: use `scale_fill_distiller()` rather than `scale_fill_brewer()` for continuous data.
```{r, purl=F}
m + stat_density2d(aes(fill = ..level..), geom="polygon") +
  geom_point(size=.75)+
  scale_fill_distiller(palette="OrRd",
                       name="Kernel\nDensity")+
  xlim(0.5, 6) + ylim(40, 110)+
  xlab("Eruption Duration (minutes)")+
  ylab("Waiting time (minutes)")
```

Or use `geom="tile"` for a raster representation.

```{r, purl=F}
m + stat_density2d(aes(fill = ..density..), geom="tile", contour=FALSE) +
  geom_point(size=.75)+
  scale_fill_distiller(palette="OrRd",
                       name="Kernel\nDensity")+
  xlim(0.5, 6) + ylim(40, 110)+
  xlab("Eruption Duration (minutes)")+
  ylab("Waiting time (minutes)")
```
## Axis scaling

Create noisy exponential data:

```{r}
set.seed(201)
n <- 100
dat <- data.frame(
    xval = (1:n+rnorm(n,sd=5))/20,
    yval = 10^((1:n+rnorm(n,sd=5))/20)
)
```

Make a scatter plot with regular (linear) axis scaling:

```{r}
sp <- ggplot(dat, aes(xval, yval)) +
  geom_point()
sp
```

Example from [R Cookbook](http://www.cookbook-r.com/Graphs/Axes_(ggplot2)/)

log10 scaling of the y axis (with visually-equal spacing):

```{r}
sp + scale_y_log10()
```

## Coordinate Systems

## Position

### Stacked bars

```{r}
ggplot(diamonds, aes(clarity, fill=cut)) +
  geom_bar()
```

### Dodged bars

```{r}
ggplot(diamonds, aes(clarity, fill=cut)) +
  geom_bar(position="dodge")
```

# Facets

Use facets to divide a graphic into *small multiples* based on a categorical variable.

`facet_wrap()` for one variable:

```{r}
ggplot(mpg, aes(x = cty, y = hwy, color = factor(cyl))) +
  geom_point()+
  facet_wrap(~year)
```

`facet_grid()` for two variables:

```{r}
ggplot(mpg, aes(x = cty, y = hwy, color = factor(cyl))) +
  geom_point()+
  facet_grid(year~cyl)
```

*Small multiples* (via facets) are very useful for visualizing time series (and especially time series of spatial data).

# Themes

Set *default* display parameters (colors, font sizes, etc.) for different purposes (for example print vs. presentation) using themes.

## GGplot Themes

Quickly change plot appearance with themes.

### More options in the `ggthemes` package.

```{r}
library(ggthemes)
```

Or build your own!
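One way to build your own theme is to start from a complete theme and override individual elements with `theme()`. The sketch below is only an illustration: the name `theme_course` and every setting in it are made up for this example.

```r
library(ggplot2)

# Hypothetical custom theme: start from theme_bw() and adjust a few elements
theme_course <- function(base_size = 14) {
  theme_bw(base_size = base_size) +
    theme(
      panel.grid.minor = element_blank(),           # drop minor grid lines
      legend.position  = "bottom",                  # move the legend below the plot
      plot.title       = element_text(face = "bold")
    )
}

ggplot(mpg, aes(x = cty, y = hwy, color = factor(cyl))) +
  geom_jitter() +
  labs(title = "City vs. highway mileage") +
  theme_course()
```

Wrapping the settings in a function keeps the theme reusable, and the `base_size` argument lets the same theme serve print and presentation sizes.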
### Theme examples: default

```{r}
p <- ggplot(mpg, aes(x = cty, y = hwy, color = factor(cyl))) +
  geom_jitter() +
  labs(
    x = "City mileage/gallon",
    y = "Highway mileage/gallon",
    color = "Cylinders"
  )
```

```{r}
p
```

### Theme examples: Solarized

```{r}
p + theme_solarized()
```

### Theme examples: Solarized Dark

```{r}
p + theme_solarized(light=FALSE)
```

### Theme examples: Excel

```{r}
p + theme_excel()
```

### Theme examples: _The Economist_

```{r}
p + theme_economist()
```

### Theme examples: _XKCD_

XKCD: A webcomic of romance, sarcasm, math, and language.

Note: the following code will only work if you have the xkcd font installed. See `vignette("xkcd-intro")` for details.

```{r, warning=FALSE, message=F}
library(xkcd)
ggplot(mtcars, aes(mpg, wt)) +
  geom_point() +
  geom_smooth()+
  ylab("Weight")+xlab("Miles per Gallon")+
  theme_xkcd()
```

# Saving/exporting

## Saving using the GUI

## Saving using `ggsave()`

Save a `ggplot` with sensible defaults:

```{r,eval=F}
ggsave(filename, plot = last_plot(), scale = 1, width, height)
```

## Saving using devices

Save any plot with maximum flexibility:

```{r,eval=F}
pdf(filename, width, height)  # open device
ggplot()                      # draw the plot(s)
dev.off()                     # close the device
```

**Formats**

* pdf
* jpeg
* png
* tiff

and more...

# Today's task

Now [complete the first task here](CS_02.html) by yourself or in small groups.

## GGPLOT2 Documentation

Perhaps R's best documented package: [docs.ggplot2.org](http://docs.ggplot2.org/current/)

## Colophon

Sources:

* [ggplot cheatsheet](https://www.rstudio.com/wp-content/uploads/2015/03/ggplot2-cheatsheet.pdf)

Licensing:

* Presentation: [CC-BY-3.0](http://creativecommons.org/licenses/by/3.0/us/)
* Source code: [MIT](http://opensource.org/licenses/MIT)