The location (physical, temporal, ecological, etc.) of an observation can is represented as a coordiante consisting of one or more values specifying the localization within a larger space. Coordinates repreenting a physical location are represented by datum embedded within a larger

Raw Data

We often start with data in raw format. For this example, we will be using the Araptus attenuata Sonoran desert bark beetle data set from the scape library. If you do not have it, install it from the Dyerlab Github account. Here is what it looks like.

##       Site      Longitude         Latitude         Males      
##  12     : 1   Min.   :-114.3   Min.   :23.29   Min.   : 9.00  
##  153    : 1   1st Qu.:-113.2   1st Qu.:24.88   1st Qu.:16.00  
##  157    : 1   Median :-112.0   Median :26.64   Median :19.00  
##  159    : 1   Mean   :-111.9   Mean   :26.42   Mean   :24.93  
##  160    : 1   3rd Qu.:-110.7   3rd Qu.:28.04   3rd Qu.:28.00  
##  161    : 1   Max.   :-109.3   Max.   :29.33   Max.   :64.00  
##  (Other):23                                                   
##     Females       Suitability     
##  Min.   : 5.00   Min.   :0.05628  
##  1st Qu.:15.00   1st Qu.:0.26730  
##  Median :21.00   Median :0.41251  
##  Mean   :23.62   Mean   :0.43527  
##  3rd Qu.:30.00   3rd Qu.:0.56413  
##  Max.   :63.00   Max.   :0.90188  
## 

These data represent:
1. Sampling sites in Baja California 2. Census of male and female beetles sampled. 3. The habitat suitability (from a Niche Model) based upon the host-plant range.

Geometric Objects

In addition to normal floating-point numbers being able to represent coordinate data such as latitude and longitude, we can also specify the data as objects recognizable for spatial databases, GIS, opensource libraries, GeoJSON, etc. These are ISO standards and if we put our data into these formats, we gain the ability to use a much wider array of tools.

For our purposes, we will use the implentation in the sf (Simple Features) library.

Most of the methods in the library are prefixed with st_ as a nod to the routine names that are used in geospatial databases (so by learning sf you are getting a leg up on how to use spatial database calls as well).

To make a point we can construct it as a Well-Known Text object (WKT) as:

## [1] "POINT (-111.7 26.14)"

or as a Well-Known Binary (WBK) as:

##  [1] 01 01 00 00 00 cd cc cc cc cc ec 5b c0 a4 70 3d 0a d7 23 3a 40

We can convert the data.frame above to one that contains these GEOMETRY objects.

## Simple feature collection with 29 features and 4 fields
## geometry type:  POINT
## dimension:      XY
## bbox:           xmin: -114.2935 ymin: 23.2855 xmax: -109.327 ymax: 29.32541
## epsg (SRID):    NA
## proj4string:    NA
## First 10 features:
##    Site Males Females Suitability                   geometry
## 1    12    24      21   0.3519050 POINT (-112.6655 27.18232)
## 2   153    35      41   0.7324870 POINT (-110.4624 24.13389)
## 3   157    26      30   0.8810290   POINT (-110.096 24.0195)
## 4   159    22      15   0.1879650 POINT (-113.3161 27.52944)
## 5   160    48      36   0.3651910 POINT (-112.5296 27.40498)
## 6   161    64      63   0.2791050   POINT (-112.986 27.0367)
## 7   162    57      41   0.6136198   POINT (-112.408 27.2028)
## 8   163    21      21   0.4328730   POINT (-110.951 24.2115)
## 9   166    19      26   0.2673030 POINT (-112.0806 25.91409)
## 10  168    28      25   0.4964650 POINT (-111.2156 25.55757)

However, notice there is no Coordinate Reference System (CRS) associated with these data.

## Coordinate Reference System: NA

The ‘coordinates’ we specified are just points, which we happened to label as “Latitude” and “Longitude” but we could have called them “X” and “Y” and unless you specify the CRS, R will pretend like it has no idea what system you are using. If we want to make them represent real coordiantes (as in Latitude and Longitude we yanked off our GPS unit or grabbed from Google Earth), we need to specify the CRS directly. See the page on projections for more information.

Here is a quick way to set a CRS using EPSG codes, an EPSG code (see EPSG.io for more information) is a shorthand notation for a coordinate reference system. There are tons of different projects available and you can make up your own if you like. For this example, I’ll use one of the most common ones we will run into, epsg:3857.

## Simple feature collection with 29 features and 4 fields
## geometry type:  POINT
## dimension:      XY
## bbox:           xmin: -114.2935 ymin: 23.2855 xmax: -109.327 ymax: 29.32541
## epsg (SRID):    3857
## proj4string:    +proj=merc +a=6378137 +b=6378137 +lat_ts=0.0 +lon_0=0.0 +x_0=0.0 +y_0=0 +k=1.0 +units=m +nadgrids=@null +wktext +no_defs
## First 10 features:
##    Site Males Females Suitability                   geometry
## 1    12    24      21   0.3519050 POINT (-112.6655 27.18232)
## 2   153    35      41   0.7324870 POINT (-110.4624 24.13389)
## 3   157    26      30   0.8810290   POINT (-110.096 24.0195)
## 4   159    22      15   0.1879650 POINT (-113.3161 27.52944)
## 5   160    48      36   0.3651910 POINT (-112.5296 27.40498)
## 6   161    64      63   0.2791050   POINT (-112.986 27.0367)
## 7   162    57      41   0.6136198   POINT (-112.408 27.2028)
## 8   163    21      21   0.4328730   POINT (-110.951 24.2115)
## 9   166    19      26   0.2673030 POINT (-112.0806 25.91409)
## 10  168    28      25   0.4964650 POINT (-111.2156 25.55757)

The epgs:3857 is used by Google Maps, ESRI, OpenStreet Maps, Bing, and other providers and is equivallent to (for those who care) to the following WKT definition:

 PROJCS["WGS 84 / Pseudo-Mercator",
    GEOGCS["WGS 84",
        DATUM["WGS_1984",
            SPHEROID["WGS 84",6378137,298.257223563,
                AUTHORITY["EPSG","7030"]],
            AUTHORITY["EPSG","6326"]],
        PRIMEM["Greenwich",0,
            AUTHORITY["EPSG","8901"]],
        UNIT["degree",0.0174532925199433,
            AUTHORITY["EPSG","9122"]],
        AUTHORITY["EPSG","4326"]],
    PROJECTION["Mercator_1SP"],
    PARAMETER["central_meridian",0],
    PARAMETER["scale_factor",1],
    PARAMETER["false_easting",0],
    PARAMETER["false_northing",0],
    UNIT["metre",1,
        AUTHORITY["EPSG","9001"]],
    AXIS["X",EAST],
    AXIS["Y",NORTH],
    EXTENSION["PROJ4","+proj=merc +a=6378137 +b=6378137 +lat_ts=0.0 +lon_0=0.0 +x_0=0.0 +y_0=0 +k=1.0 +units=m +nadgrids=@null +wktext  +no_defs"],
    AUTHORITY["EPSG","3857"]]

Bounding Boxes

A bounding box is the minimal set of coordinates that create a square box (in Longitude and Latitude) that encompass all the points. Here is the box for the sites we collected.

##       xmin       ymin       xmax       ymax 
## -114.29353   23.28550 -109.32700   29.32541

This is often useful in grabbing maps as we need to know the extent of the data.