forked from AdamWilsonLabEDU/SpatialDataScience
-
Notifications
You must be signed in to change notification settings - Fork 0
Expand file tree
/
Copy path04_Presentation.Rmd
More file actions
600 lines (417 loc) · 17.5 KB
/
04_Presentation.Rmd
File metadata and controls
600 lines (417 loc) · 17.5 KB
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
---
title: "Spatial data handling"
output:
rmdshower::shower_presentation:
theme: material
---
```{r, echo=F,message=F,warning=F}
library(methods)
library(rgdal)
library(rgeos)
library(sp)
library(sf)
library(tidyverse)
library(spData)
```
---
## Spatial data handling in R
Packages:
* `sp` First major spatial data package/format
* `rgdal` reading and writing spatial data
* `rgeos` Interface to open-source geometry engine (GEOS)
* `sf` Spatial Features in the 'tidyverse'
* `raster` raster (gridded) data
* and a few others...
---
## What is a Spatial Feature (sf)?
Typically an object in the real world, such as a building or a tree.
A forest stand can be a feature, a forest can be a feature, a city can be a feature.
A satellite image pixel can be a feature, a complete image can be
a feature too.
---
```{r, echo=F}
px <- c(5, 7, 8, 9, 8, 7, 6)
py <- c(7, 3, 4, 8, 9, 15, 14)
plot(px, py, type="n", axes=F, xlab = '', ylab = '')
polygon(px, py, col = "khaki1")
points(c(6, 9, 8, 8.5), c(9, 14, 8, 9), pch=20, col = "peachpuff4", lwd = 3)
lines(c(5, 6, 7, 8), c(5, 6,10, 11), col = "steelblue1", lwd = 3)
lines(c(8, 9), c(14, 12), col = "dark green", lwd = 3)
```
---
## Spatial Features
What information do we need to store in order to define points, lines, polygons in geographic space?
- lat/lon coordinates
- projection
- what type (point/line/poly)
- if polygon, is it a hole or not
- attribute data
- ... ?
---
## Geometry
Features have a _geometry_ describing _where_ on Earth the
feature is located, and they have attributes, which describe other
properties.
A Tree:
* delineation of its crown
* its stem
* point indicating its centre
Attributes:
* species,
* height
* diameter at breast height at a particular date, and so on.
---
## Spatial Feature Standard
"_A simple feature is defined by the OpenGIS Abstract specification to have both spatial and non-spatial attributes. Spatial attributes are geometry valued, and simple features are based on 2D geometry with linear interpolation between vertices._"
---
## Dimensions
All geometries composed of points. Points are coordinates in a
2-, 3- or 4-dimensional space. All points in a geometry have the
same dimensionality. In addition to X and Y coordinates, there are
two optional additional dimensions:
* a Z coordinate, denoting altitude
* an M coordinate (rarely used), denoting some _measure_ that is associated with the point, rather than with the feature as a whole (in which case it would be a feature attribute); examples could be time of measurement, or measurement error of the coordinates
---
## Dimensions
The four possible cases then are:
1. two-dimensional points refer to x and y, easting and northing, or longitude and latitude, we refer to them as XY
2. three-dimensional points as XYZ
3. three-dimensional points as XYM
4. four-dimensional points as XYZM (the third axis is Z, fourth M)
---
## Simple feature geometry types
The following seven simple feature types are the most common, and are for instance the only ones used for [GeoJSON](https://tools.ietf.org/html/rfc7946):
| type | description |
| ---- | ----------- |
| `POINT` | a single point |
| `LINESTRING` | sequence of points connected by lines |
| `POLYGON` | sequence of points form a closed ring |
| `MULTIPOINT` | set of points |
| `MULTILINESTRING` | set of linestrings |
| `MULTIPOLYGON` | set of polygons |
| `GEOMETRYCOLLECTION` | set of geometries |
---
## Uncommon Geometry Types
10 more geometries 10 are rare:
* `CIRCULARSTRING`
* `COMPOUNDCURVE`
* `CURVEPOLYGON`
* `MULTICURVE`
* `MULTISURFACE`
* `CURVE`
* `SURFACE`
* `POLYHEDRALSURFACE`
* `TIN`
* `TRIANGLE`
---
## Coordinate reference system
Coordinates can only be placed on the Earth's surface when their coordinate
reference system (CRS) is known; this may be an elipsoidal CRS such as WGS84, a projected, two-dimensional (Cartesian) CRS such as a UTM zone or Web Mercator, or a CRS
in three-dimensions or [including time](http://www.faculty.jacobs-university.de/pbaumann/iu-bremen.de_pbaumann/Papers/acmgis-2012_crs-nts.pdf). Similarly, M-coordinates need an attribute reference
system, e.g. a [measurement unit](https://CRAN.R-project.org/package=units).
---
## How simple features in R are organized
Package `sf` represents simple features as native R objects.
Similar to [PostGIS](http://postgis.net/), all functions and methods
in `sf` that operate on spatial data are prefixed by `st_`, which
refers to _spatial and temporal_; this makes them easily findable
by command-line completion. Simple features are implemented as
R native data, using simple data structures (S3 classes, lists,
matrix, vector). Typical use involves reading, manipulating and
writing of sets of features, with attributes and geometries.
---
Attributes typically stored in `data.frame` objects (or the
very similar `tbl_df`), we will also store feature geometries in
a `data.frame` column. Since geometries are not single-valued,
they are put in a list-column, a list of length equal to the number
of records in the `data.frame`, with each list element holding the simple
feature geometry of that feature. The three classes used to
represent simple features are:
---
## Components
* `sf`, the table (`data.frame`) with feature attributes and feature geometries, which contains
* `sfc`, the list-column with the geometries for each feature (record), which is composed of
* `sfg`, the feature geometry of an individual simple feature.
---
There are currently two main approaches in R to handle geographic vector data.
---
# `sp` package
---
First package to provide classes and methods for spatial data types in R is called [`sp`](https://cran.r-project.org/package=sp). Provides classes and methods to create _points_, _lines_, _polygons_, and _grids_ and to operate on them.
About 350 of the spatial analysis packages use the spatial data types that are implemented in `sp` i.e. they "depend" on the `sp` package and many more are indirectly dependent.
---
Many spatial R packages still depends on the **sp** package, therefore it is important to know how to convert **sp** to and from **sf** objects
---
Foundational structure for any spatial object in `sp` is the `Spatial` class. It has two "slots" ([new-style S4 class objects in R have pre-defined components called slots](http://stackoverflow.com/a/4714080)):
* a __bounding box__
* a __CRS class object__ to define the Coordinate Reference System
Basic structure extended depending on the characteristics of the spatial object (point, line, polygon).
---
## I. Create geometric objects (topology)
* __Points__ most basic spatial data objects. Typically from a two-column matrix/dataframe with a column for latitude and one for longitude.
* __Lines__ are generated out of `Line` objects. A `Line` object is a collection of 2D coordinates and is generated out of a two-column matrix or a dataframe with a column for latitude and one for longitude. A `Lines` object is a __list__ of one or more `Line` objects.
* __Polygons__ are generated out of `Polygon` objects. A `Polygon` object is a collection of 2D coordinates with equal first and last coordinates and is generated out of a two-column matrix or a dataframe with a column for latitude and one for longitude. A `Polygons` object is a __list__ of one or more `Polygon` objects, for example islands belonging to the same country.
---
A very simple `Line` object:
```{r}
ln <- Line(matrix(runif(6), ncol=2))
str(ln)
ln@coords
```
---
A very simple `Lines` object:
```{r}
lns <- Lines(list(ln,ln), ID = "a") # this contains just two (identical) Lines!
str(lns)
```
---
## II. Create spatial objects `Spatial*` object (`*` stands for Points, Lines, or Polygons).
Adds bounding box and the slot for the Coordinate Reference System or CRS. `SpatialPoints` can be directly generated out of the coordinates. `SpatialLines` and `SpatialPolygons` objects are generated using lists of `Lines` or `Polygons` objects respectively (more below).
---
See here for how to create a `SpatialLines` object:
```{r}
sp_lns <- SpatialLines(list(lns))
str(sp_lns)
```
---
## III. Add attributes (_Optional_:)
Add attribute data, which will turn your `Spatial*` object into a `Spatial*DataFrame` object. ID fields are here required to match the data frame row names.
---
Create a simple `SpatialLinesDataframe`:
```{r}
dfr <- data.frame(id = "a", use = "road", cars_per_hour = 10) # note how we use the ID from above!
sp_lns_dfr <- SpatialLinesDataFrame(sp_lns, dfr, match.ID = "id")
str(sp_lns_dfr)
```
---
## Commonly used spatial methods
function | and what it does
------------ | ------------------------------------------------------
`bbox()` | returns the bounding box coordinates
`proj4string()` | sets or retrieves projection attributes using the CRS object.
`CRS()` | creates an object of class of coordinate reference system arguments
`spplot()` | plots all the attributes
`coordinates()` | set or retrieve the spatial coordinates.
`over(a, b)` | used to retrieve the polygon or grid indices on a set of points
`spsample()` | sampling of spatial points within the spatial extent of objects
---
# `sf` package
---
[`sf`](https://cran.r-project.org/package=sf) implements a formal standard called ["Simple Features"](https://en.wikipedia.org/wiki/Simple_Features) that specifies a storage and access model of spatial geometries (point, line, polygon). A feature geometry is called simple when it consists of points connected by straight line pieces, and does not intersect itself.
This standard has been adopted widely, not only by spatial databases such as PostGIS, but also more recent standards such as GeoJSON.
---
If you work with PostGis or GeoJSON you may have come across the [WKT (well-known text)](https://en.wikipedia.org/wiki/Well-known_text) format, for example like these:
`POINT (30 10)
LINESTRING (30 10, 10 30, 40 40)
POLYGON ((30 10, 40 40, 20 40, 10 20, 30 10))`
`sf` implements this standard natively in R. Data are structured and conceptualized very differently from the `sp` approach.
---
In `sf` spatial objects are stored as a simple data frame with a special column that contains the information for the geographic coordinates. That special column is a list with the same length as the number of rows in the data frame. Each of the individual list elements then can be of any length needed to hold the coordinates that correspond to an individual feature.
---
## I. Create geometric objects (topology)
Geometric objects (simple features) can be created from a numeric vector, matrix or a list with the coordinates. They are called `sfg` objects for Simple Feature Geometry.
See here for an example of how a LINESTRING `sfg` object is created:
```{r}
lnstr_sfg <- st_linestring(matrix(runif(6), ncol=2))
class(lnstr_sfg)
```
---
## II. Combine all individual single feature objects for the special column.
In order to work our way towards a data frame for all features we create what is called an `sfc` object with all individual features, which stands for Simple Feature Collection. The `sfc` object also holds the bounding box and the projection information.
---
See here for an example of how a `sfc` object is created:
```{r}
(lnstr_sfc <- st_sfc(lnstr_sfg)) # just one feature here
class(lnstr_sfc)
```
---
## III. Add attributes.
We now combine the dataframe with the attributes and the simple feature collection.
See here how its done.
```{r}
(lnstr_sf <- st_sf(dfr , lnstr_sfc))
class(lnstr_sf)
```
There are many methods available in the `sf` package, to find out use `methods(class="sp")`
---
## sf Highlights
* provides **fast** I/O, particularly relevant for large files
* directly reads from and writes to spatial **databases** such as PostGIS
* compatibile with the *tidyverse*
* stay tuned for a new `ggplot` release that will be able to read and plot the `sf` format without the need of conversion to a data frame, like the `sp` format
---
`sp` and `sf` are _not_ only formats for spatial objects. Other spatial packages may use their own class definitions for spatial data (for example `spatstat`). Usuallly you can find functions that convert `sp` and increasingly `sf` objects to and from these formats.
---
```{r}
world_sp = as(world, "Spatial")
world_sf = st_as_sf(world_sp)
str(world_sp)
```
---
```{r}
str(world)
```
---
- The structures in the **sp** packages are more complicated - `str(world_sf)` vs `str(world_sp)`
- Moreover, many of the **sp** functions are not "pipeable" (it's hard to combine them with the **tidyverse**)
```{r, eval = F}
world_sp %>%
filter(name_long == "England")
```
`Error in UseMethod("filter_") :
no applicable method for 'filter_' applied to an object of class "c('SpatialPolygonsDataFrame', 'SpatialPolygons', 'Spatial', 'SpatialPolygonsNULL', 'SpatialVector')"`
---
## Reading and writing spatial data
```{r}
vector_filepath = system.file("shapes/world.gpkg", package = "spData")
vector_filepath
world = st_read(vector_filepath)
```
Counterpart to `st_read()` is the `st_write` function, e.g. `st_write(world, 'data/new_world.gpkg')`. A full list of supported formats could be found using `sf::st_drivers()`.
---
## Structure of the sf objects
```{r, eval = FALSE}
world
```
```{r, echo = FALSE}
print(world, n=3)
```
```{r}
class(world)
```
---
## Structure of the sf objects
```{r, eval=FALSE}
world$name_long
```
```{r, echo=FALSE}
world$name_long[1:3]
```
```{r, eval=FALSE}
world$geom
```
```{r, echo=FALSE}
print(world$geom, n = 3)
```
## Non-spatial operations on the sf objects
```{r, warning=FALSE}
world %>%
left_join(worldbank_df, by = "iso_a2") %>%
select(name_long, pop, pop_growth, area_km2) %>%
arrange(area_km2) %>%
mutate(pop_density = pop/area_km2) %>%
rename(population = pop)
```
---
## Non-spatial operations
```{r}
world_cont = world %>%
group_by(continent) %>%
summarize(pop_sum = sum(pop, na.rm = TRUE))
```
```{r, echo=FALSE}
print(world_cont, n = 1)
```
---
The `st_set_geometry` function can be used to remove the geometry column:
```{r}
world_df =st_set_geometry(world_cont, NULL)
class(world_df)
```
---
## Spatial operations
It's a big topic which includes:
- Spatial subsetting
- Spatial joining/aggregation
- Topological relations
- Distances
- Spatial geometry modification
- Raster operations (map algebra)
See [Chapter 4](http://robinlovelace.net/geocompr/spatial-data-operations.html#spatial-operations-on-raster-data) of *Geocomputation with R*
## CRS
Transform (warp) to a different projection:
```{r}
na_2163 = world %>%
filter(continent == "North America") %>%
st_transform(2163) #US National Atlas Equal Area
st_crs(na_2163)
```
---
## Compare projections
```{r}
na = world %>%
filter(continent == "North America")
par(mfrow = c(1, 2), mar=c(0,0,0,0))
plot(na[0]);plot(na_2163[0])
```
---
## The `world` dataset
```{r}
plot(na_2163)
```
---
## Spatial operations
```{r, warning = FALSE, message = FALSE, fig.height = 4}
canada = na_2163 %>%
filter(name_long=="Canada")
canada_buffer=canada%>%
st_buffer(500000) %>%
filter()
plot(select(canada,geom))
plot(select(canada_buffer,geom),add=T,border="red")
```
## Basic maps
- Basic maps of `sf` objects can be quickly created using the `plot()` function:
```{r}
plot(world[0])
```
```{r}
plot(world["pop"])
```
---
## [tmap](https://cran.r-project.org/web/packages/tmap/vignettes/tmap-nutshell.html): Thematic Maps
Syntax for creating plots is similar to that of ggplot2.
The add-on package `tmaptools` contains tool functions for reading and processing shape files.
---
```{r, fig.align='center', fig.height=4, message=FALSE}
library(tmap)
tmap_mode("plot") #check the "view"
tm_shape(world, projection="wintri") +
tm_polygons("lifeExp", title=c("Life expactancy"),
style="pretty", palette="RdYlGn", auto.palette.mapping=FALSE) +
tm_style_grey()
```
---
## leaflet
```{r}
library(leaflet)
library(widgetframe)
```
```{r, eval=F}
leaflet(world) %>%
addTiles() %>%
addPolygons(color = "#444444", weight = 1, fillOpacity = 0.5,
fillColor = ~colorQuantile("YlOrRd", lifeExp)(lifeExp),
popup = paste(round(world$lifeExp, 2)))
```
```{r, echo=T, message=T,results='asis'}
l = leaflet(world) %>%
addTiles() %>%
addPolygons(color = "#444444", weight = 1, fillOpacity = 0.5,
fillColor = ~colorQuantile("YlOrRd", lifeExp)(lifeExp),
popup = paste(round(world$lifeExp, 2)))
saveWidget(l, file = "04_assets/leaflet.html",selfcontained = T)
```
<iframe src="04_assets/leaflet.html" width=100% height=400px></iframe>
---
## Raster data in the tidyverse
Raster data is not yet closely connected to the **tidyverse**, however:
- Some functions from the **raster** package works well in `pipes`
- You can convert vector data to the `Spatial*` form using `as(my_vector, "Spatial")`before working on raster-vector interactions
- There are some initial efforts to bring raster data closer to the **tidyverse**, such as [tabularaster](https://github.com/hypertidy/tabularaster), [sfraster](https://github.com/mdsumner/sfraster), or [fasterize](https://github.com/ecohealthalliance/fasterize)
- The development of the [stars](https://github.com/r-spatial/stars), package focused on multidimensional, large datasets should start soon. It will allow pipe-based workflows
---
## Sources
- Slides adapted from:
- "Robin Lovelace and Jakub Nowosad" draft book [_Geocomputation with R_ (to be published in 2018)](http://robinlovelace.net/geocompr/). Source code at https://github.com/robinlovelace/geocompr.
- [Claudia Engel's spatial analysis workshop](https://github.com/cengel/rspatial/blob/master/2_spDataTypes.Rmd)