There is a lot of data contained in Shapefiles (ESRI formats) that we need to work with. Here we look at how to load them in, how to plot them, and how to save them.

Loading

The ShapeFile is an ubiquitous data format associated with ESRI. It is actually a collection of several file types, the most important of which include:

Extension Description
*.shp The actual shape file containing the geometries
*.dbf A database file with the data associated with the objects in the shape file
*.shx An index file
*.prj The file specifying the projection

There may be many more files that are generated as part of the shapefile1

To read in a shapefile, we’ll use the sf (Simple Features) library. Most of the functions in this library have the prefix st_ to mimic the spatial operations used in PostGIS, so if you make the leap to working with a Postgres/PostGIS spatial database, you’ll feel right at home.

## Reading layer `structures' from data source `/Users/rodney/Desktop/DLab-Spatial/data/structures.shp' using driver `ESRI Shapefile'
## Simple feature collection with 8134 features and 5 fields
## geometry type:  POLYGON
## dimension:      XY
## bbox:           xmin: -77.46736 ymin: 37.52097 xmax: -77.4254 ymax: 37.55161
## epsg (SRID):    4326
## proj4string:    +proj=longlat +datum=WGS84 +no_defs

Automatically, the meta data about the object is presented. Notice that the structures object is actually a mix of two classes

## [1] "sf"         "data.frame"

making them rather easy to use in the normal workflow.

##  SubType                                    GlobalID      RuleID_DS     
##  1:6280   {0001D58D-128C-4968-ABEC-AB99D5B1FBBB}:   1   Min.   :0.0000  
##  2:  18   {0009F83B-E6F1-468D-A0D2-DDC0F505FF9A}:   1   1st Qu.:1.0000  
##  3:1513   {001995D4-340F-49E2-B4C6-458A1F994535}:   1   Median :1.0000  
##  4:  17   {002A31BF-D7A2-4E01-A003-81404B34AB6A}:   1   Mean   :0.9962  
##  5:   5   {002CB913-0A0A-42DA-9943-E9C2B4A43EA6}:   1   3rd Qu.:1.0000  
##  6:   9   {002E3670-AC3A-4875-8FAF-DD57FF3DBE58}:   1   Max.   :1.0000  
##  7: 292   (Other)                               :8128                   
##    Shape_Leng        Shape_Area                geometry   
##  Min.   :  14.54   Min.   :     5.7   POLYGON      :8134  
##  1st Qu.:  63.81   1st Qu.:   219.4   epsg:4326    :   0  
##  Median : 132.90   Median :   870.2   +proj=long...:   0  
##  Mean   : 170.26   Mean   :  2610.0                       
##  3rd Qu.: 185.29   3rd Qu.:  1519.3                       
##  Max.   :4308.17   Max.   :360331.7                       
## 

You can use functions from the dplyr library to filter the contents of sf/data.frame objects. Here is an example where I plot the structures of SubType==1 as well as paved roads and different water sources.

## Reading layer `roads' from data source `/Users/rodney/Desktop/DLab-Spatial/data/roads.shp' using driver `ESRI Shapefile'
## Simple feature collection with 318 features and 4 fields
## geometry type:  MULTIPOLYGON
## dimension:      XY
## bbox:           xmin: -77.46736 ymin: 37.52097 xmax: -77.4254 ymax: 37.55161
## epsg (SRID):    4326
## proj4string:    +proj=longlat +datum=WGS84 +no_defs
## Reading layer `water' from data source `/Users/rodney/Desktop/DLab-Spatial/data/water.shp' using driver `ESRI Shapefile'
## Simple feature collection with 9 features and 6 fields
## geometry type:  POLYGON
## dimension:      XY
## bbox:           xmin: -77.46736 ymin: 37.52409 xmax: -77.4254 ymax: 37.5352
## epsg (SRID):    4326
## proj4string:    +proj=longlat +datum=WGS84 +no_defs


  1. I know, it is kind of a mis-nomer to call it a shapefile if it is a collection of files AND most problematically, it basically violates most rules of Reproducible Research. But precedence is precedence.