X Tutup
--- title: "Working with Spatial Data" --- ```{r, echo=FALSE, message=FALSE, results='hide', purl=FALSE} source("knitr_header.R") ``` [ The R Script associated with this page is available here](`r output`). Download this file and open it (or copy-paste into a new script) with RStudio so you can follow along. # Setup ## Load packages ```{r,messages=F,warning=F, results="hide"} library(sp) library(rgdal) library(ggplot2) library(dplyr) library(tidyr) library(maptools) ``` # Point data ## Generate some random data ```{r} coords = data.frame( x=rnorm(100), y=rnorm(100) ) str(coords) ``` ```{r} plot(coords) ``` ## Convert to `SpatialPoints` ```{r} sp = SpatialPoints(coords) str(sp) ``` ## Create a `SpatialPointsDataFrame` First generate a dataframe (analagous to the _attribute table_ in a shapefile) ```{r} data=data.frame(ID=1:100,group=letters[1:20]) head(data) ``` Combine the coordinates with the data ```{r} spdf = SpatialPointsDataFrame(coords, data) spdf = SpatialPointsDataFrame(sp, data) str(spdf) ``` Note the use of _slots_ designated with a `@`. See `?slot` for more. ## Promote a data frame with `coordinates()` ```{r} coordinates(data) = cbind(coords$x, coords$y) ``` ```{r} str(spdf) ``` ## Subset data ```{r} subset(spdf, group=="a") ``` Or using `[]` ```{r} spdf[spdf$group=="a",] ``` Unfortunately, `dplyr` functions do not directly filter spatial objects.
## Your turn Convert the following `data.frame` into a SpatialPointsDataFrame using the `coordinates()` method and then plot the points with `plot()`. ```{r} df=data.frame( lat=c(12,15,17,12), lon=c(-35,-35,-32,-32), id=c(1,2,3,4)) ``` ```{r,purl=F,echo=F} knitr::kable(df) ```
```{r, purl=F} coordinates(df)=c("lon","lat") plot(df) ```
## Examine topsoil quality in the Meuse river data set ```{r} ## Load the data data(meuse) str(meuse) ```
## Your turn _Promote_ the `meuse` object to a spatial points data.frame with `coordinates()`.
```{r,purl=F} coordinates(meuse) <- ~x+y # OR coordinates(meuse)=cbind(meuse$x,meuse$y) # OR coordinates(meuse))=c("x","y") str(meuse) ```
Plot it with ggplot: ```{r, fig.height=4} ggplot(as.data.frame(meuse),aes(x=x,y=y))+ geom_point(col="red")+ coord_equal() ``` Note that `ggplot` works only with data.frames. Convert with `as.data.frame()` or `fortify()`. # Lines ### A `Line` is a single chain of points. ```{r} L1 = Line(cbind(rnorm(5),rnorm(5))) L2 = Line(cbind(rnorm(5),rnorm(5))) L3 = Line(cbind(rnorm(5),rnorm(5))) L1 ``` ```{r} plot(coordinates(L1),type="l") ``` ### A `Lines` object is a list of chains with an ID ```{r} Ls1 = Lines(list(L1),ID="a") Ls2 = Lines(list(L2,L3),ID="b") Ls2 ``` ### A `SpatialLines` is a list of Lines ```{r} SL12 = SpatialLines(list(Ls1,Ls2)) plot(SL12) ``` ### A `SpatialLinesDataFrame` is a `SpatialLines` with a matching `DataFrame` ```{r} SLDF = SpatialLinesDataFrame( SL12, data.frame( Z=c("road","river"), row.names=c("a","b") )) str(SLDF) ``` # Polygons ## Getting complicated alt text ### Issues * Multipart Polygons * Holes Rarely construct _by hand_... # Importing data But, you rarely construct data _from scratch_ like we did above. Usually you will import datasets created elsewhere. ## Geospatial Data Abstraction Library ([GDAL](gdal.org)) `rgdal` package for importing/exporting/manipulating spatial data: * `readOGR()` and `writeOGR()`: Vector data * `readGDAL()` and `writeGDAL()`: Raster data Also the `gdalUtils` package for reprojecting, transforming, reclassifying, etc. List the file formats that your installation of rgdal can read/write with `ogrDrivers()`: ```{r, echo=F} knitr::kable(ogrDrivers()) ``` Now as an example, let's read in a shapefile that's included in the `maptools` package. You can try ```{r} ## get the file path to the files file=system.file("shapes/sids.shp", package="maptools") ## get information before importing the data ogrInfo(dsn=file, layer="sids") ## Import the data sids <- readOGR(dsn=file, layer="sids") summary(sids) plot(sids) ``` ### Maptools package The `maptools` package has an alternative function for importing shapefiles that can be a little easier to use (but has fewer options). * `readShapeSpatial` ```{r} sids <- readShapeSpatial(file) ``` ### Raster data We'll deal with raster data in the next section. # Coordinate Systems * Earth isn't flat * But small parts of it are close enough * Many coordinate systems exist * Anything `Spatial*` (or `raster*`) can have one ## Specifying the coordinate system ### The [Proj.4](https://trac.osgeo.org/proj/) library Library for performing conversions between cartographic projections. See [http://spatialreference.org](http://spatialreference.org) for information on specifying projections. For example, #### Specifying coordinate systems **WGS 84**: * proj4:
`+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs` * .prj / ESRI WKT: `GEOGCS["GCS_WGS_1984",DATUM["D_WGS_1984",`
` SPHEROID["WGS_1984",6378137,298.257223563]],`
` PRIMEM["Greenwich",0],UNIT["Degree",0.017453292519943295]]`
* EPSG:`4326` Note that it has no projection information assigned (since it came from a simple data frame). From the help file (`?meuse`) we can see that the projection is EPSG:28992. ```{r} proj4string(sids) <- CRS("+proj=longlat +ellps=clrk66") proj4string(sids) ``` ## Spatial Transform Assigning a CRS doesn't change the projection of the data, it just indicates which projection the data are currently in. So assigning the wrong CRS really messes things up. Transform (_warp_) projection from one to another with `spTransform` Project the `sids` data to the US National Atlas Equal Area (Lambert azimuthal equal-area projection): ```{r} sids_us = spTransform(sids,CRS("+proj=laea +lat_0=45 +lon_0=-100 +x_0=0 +y_0=0 +a=6370997 +b=6370997 +units=m +no_defs")) ``` Compare the _bounding box_: ```{r} bbox(sids) bbox(sids_us) ``` And plot them: ```{r} # Geographic ggplot(fortify(sids),aes(x=long,y=lat,order=order,group=group))+ geom_polygon(fill="white",col="black")+ coord_equal() # Equal Area ggplot(fortify(sids_us),aes(x=long,y=lat,order=order,group=group))+ geom_polygon(fill="white",col="black")+ coord_equal()+ ylab("Northing")+xlab("Easting") ``` # RGEOS Interface to Geometry Engine - Open Source (GEOS) using a C API for topology operations (e.g. union, simplification) on geometries (lines and polygons). ```{r} library(rgeos) ``` ## RGEOS package for polygon operations * Area calculations (`gArea`) * Centroids (`gCentroid`) * Convex Hull (`gConvexHull`) * Intersections (`gIntersection`) * Unions (`gUnion`) * Simplification (`gSimplify`) If you have trouble installing `rgeos` on OS X, look [here](http://dyerlab.bio.vcu.edu/2015/03/31/install-rgeos-on-osx/) ## Example: gSimplify Make up some lines and polygons: ```{r} p = readWKT(paste("POLYGON((0 40,10 50,0 60,40 60,40 100,50 90,60 100,60", "60,100 60,90 50,100 40,60 40,60 0,50 10,40 0,40 40,0 40))")) l = readWKT("LINESTRING(0 7,1 6,2 1,3 4,4 1,5 7,6 6,7 4,8 6,9 4)") ``` ### Simplication of lines ```{r} par(mfrow=c(1,4)) # this sets up a 1x4 grid for the plots plot(l);title("Original") plot(gSimplify(l,tol=3));title("tol: 3") plot(gSimplify(l,tol=5));title("tol: 5") plot(gSimplify(l,tol=7));title("tol: 7") ``` ### Simplification of polygons ```{r} par(mfrow=c(1,4)) # this sets up a 1x4 grid for the plots plot(p);title("Original") plot(gSimplify(p,tol=10));title("tol: 10") plot(gSimplify(p,tol=20));title("tol: 20") plot(gSimplify(p,tol=25));title("tol: 25") ``` ## Use `rgeos` functions with real spatial data Load the `sids` data again ```{r} file = system.file("shapes/sids.shp", package="maptools") sids = readOGR(dsn=file, layer="sids") ``` ## Simplify polygons with RGEOS ```{r} sids2=gSimplify(sids,tol = 0.2,topologyPreserve=T) ``` ### Plotting vectors with ggplot `fortify()` in `ggplot` useful for converting `Spatial*` objects into plottable data.frames. ```{r} sids%>% fortify()%>% head() ``` To use `ggplot` with a `fortify`ed spatial object, you must specify `aes(x=long,y=lat,order=order, group=group)` to indicate that each polygon should be plotted separately. ```{r} ggplot(fortify(sids),aes(x=long,y=lat,order=order, group=group))+ geom_polygon(lwd=2,fill="grey",col="blue")+ coord_map() ``` Now let's overlay the simplified version to see how they differ. ```{r} ggplot(fortify(sids),aes(x=long,y=lat,order=order, group=group))+ geom_polygon(lwd=2,fill="grey",col="blue")+ geom_polygon(data=fortify(sids2),col="red",fill=NA)+ coord_map() ``` How does changing the tolerance (`tol`) affect the map? ## Calculate area with `gArea` ```{r} sids$area=gArea(sids,byid = T) ``` ### Plot a chloropleth of area From [Wikipedia](https://en.wikipedia.org/wiki/Choropleth_map): > A **choropleth** (from Greek χώρο ("area/region") + πλήθος ("multitude")) is a thematic map in which areas are shaded or patterned in proportion to the measurement of the statistical variable being displayed on the map, such as population density or per-capita income. By default, the rownames in the dataframe are the unique identifier (e.g. the **FID**) for the polygons. ```{r, fig.height=4} ## add the ID to the dataframe itself for easier indexing sids$id=as.numeric(rownames(sids@data)) ## create fortified version for plotting with ggplot() fsids=fortify(sids,region="id") ggplot(sids@data, aes(map_id = id)) + expand_limits(x = fsids$long, y = fsids$lat)+ scale_fill_gradientn(colours = c("grey","goldenrod","darkgreen","green"))+ coord_map()+ geom_map(aes(fill = area), map = fsids) ``` ## Union Merge sub-geometries (polygons) together with `gUnionCascaded()` ```{r} sids_all=gUnionCascaded(sids) ``` ```{r} ggplot(fortify(sids_all),aes(x=long,y=lat,group=group,order=order))+ geom_path()+ coord_map() ``` ## Colophon See also: `Raster` package for working with raster data Sources: * [UseR 2012 Spatial Data Workshop](http://www.maths.lancs.ac.uk/~rowlings/Teaching/UseR2012/index.html) by Barry Rowlingson Licensing: * Presentation: [CC-BY-3.0 ](http://creativecommons.org/licenses/by/3.0/us/) * Source code: [MIT](http://opensource.org/licenses/MIT)
X Tutup