class: left, middle, inverse background-image: url("https://live.staticflickr.com/65535/50362989122_a8ee154fea_k_d.jpg") background-size: cover # .green[Functions: Code Compartmentalization] ### Environmental Data Literacy --- class: inverse, middle background-image: url("https://live.staticflickr.com/65535/50398542246_3971b73b4b_c_d.jpg") background-position: right background-size: fit .pull-left[.red[.fancy[Functions allow you to compartmentalize your code, so you can use it again and again.]]] --- # ⚙️ Making Functions A function is just a *chunk* of code, wrapped within curly brakets, and given a name. ```r foo <- function() { # R CODE GOES HERE } ``` -- The contents of a function may be: - A few lines of code or hundreds, - Calls to other functions, - Accepting variables as argument *input* (or not), and - Returning some object (or not). --- # Variable Scope Variables within a function are 'localized' to within that variable *alone*. This allows us to: 1. Not run out of variable names (`df` and `x` are common ones we've used thus far). 2. Not have the variables we are working with being clobbered every time we call another function. --- Define a function with a variable `x` inside the function. ```r foo <- function() { x <- 42 cat("x = ", x, "within function.\n") } ``` -- Then examine a similarly named variable with a different assigned value before and after calling that function. ```r x <- 23 cat("x = ", x, "before function call.") ``` ``` ## x = 23 before function call. ``` -- Within the function, the variable represents the value inside the function. ```r foo() ``` ``` ## x = 42 within function. ``` -- And after the function is called, examine the value of the variable. ```r cat("x = ", x, "after function call") ``` ``` ## x = 23 after function call ``` .blueinline[NOTICE: before and after function are in "global variable scope, whereas inside the function, the variable is localized within the boundaries of the function itself."] --- # Variable Scope The global scope is shown in RStudio on the *Environment* tab. .center[![](https://live.staticflickr.com/65535/50398760012_a53e2ea83d_c_d.jpg)] -- You can also see what is in the environment by asking to list all variables and functions using the `ls()` function. ```r ls() ``` ``` ## [1] "bounds" "Day_Or_Night" "directory" "favorite" ## [5] "file" "files" "foo" "freezing" ## [9] "grade" "hour" "i" "is_even" ## [13] "item" "l" "loadAndCropMyRasterMan" "my_func" ## [17] "r" "square_it" "summarize_levels" "who_is_in_the_house" ## [21] "x" "x2" ``` --- # Passing Varibles to A Function While some functions do not take input variables, many require it. To pass variables into the function scope, we identify them in the ```r foo <- function( x ) { cat("x =", x, "in the function.") } foo( 23 ) ``` ``` ## x = 23 in the function. ``` -- If you do not provide a value for a required variable, it will give you an error. ```r foo() ``` ``` ## Error in cat("x =", x, "in the function."): argument "x" is missing, with no default ``` --- # Default Function Variables We can put in *default* values for arguments by assigning values in the function parentheses. ```r favorite <- function( professor = "Dyer" ) { cat("My favorite professor is:", professor, "\n") } favorite() ``` ``` ## My favorite professor is: Dyer ``` --- # Getting Results from Functions In addition to sending data *to* a function, we often want to get something *back from* a function. We **must** be explicit about wanting to send something back to the caller using the `return()` function. ```r foo <- function( name = "Alice") { response <- paste( name, "is in the house.") return( response ) } ``` -- Then we can assign the results of that function to a variable. ```r who_is_in_the_house <- foo() who_is_in_the_house ``` ``` ## [1] "Alice is in the house." ``` --- class: sectionTitle, inverse # .orange[Conditional Execution] ### .fancy[IF this ELSE that] --- # Is it Cold Outside 🌡 It is very convenient to be able to execute come code under certain conditions and another set of code under other conditions. ```r freezing <- function( temperature ) { if( temperature <= 0 ) { print("Brrr") } else { print("Warm!") } } ``` -- ```r freezing( -2 ) ``` ``` ## [1] "Brrr" ``` ```r freezing( 18 ) ``` ``` ## [1] "Warm!" ``` --- # Multipart Conditionals Many times we may have to make decisions that have several alternative and exclusive conditions. We can extend the `IF/ELSE` pattern into an `IF/ELSE IF/ELSE IF/ELSE` pattern of arbitrary length. ```r grade <- function( percentage ) { if( percentage >= 90 ) { return( "A" ) } else if( percentage >= 80 ) { return( "B" ) } else if( percentage >= 70 ) { return( "C" ) } else if( percentage >= 60 ) { return( "D" ) } else { return( "F" ) } } ``` --- # Mutipart Conditional (cont.) ```r grade( 80 ) ``` ``` ## [1] "B" ``` ```r grade( 93 ) ``` ``` ## [1] "A" ``` ```r grade( 54 ) ``` ``` ## [1] "F" ``` --- # A Shortcut There are so many cases where we use the dichotomous `if/else` workflow, that we can simplify this code ```r if( condition ) { TRUE_CONDITION } else { FALSE_CONDITION } ``` -- Can be simplifed into ```r ifelse( CONDITION, TRUE_CONDITION, FALSE_CONDITION ) ``` --- # An Example Here is a relevant exmaple from the most recent homework. ```r hour <- sample(0:23, size = 18, replace = TRUE) hour ``` ``` ## [1] 17 1 3 22 2 7 8 11 8 3 7 3 5 10 21 5 19 4 ``` -- ```r Day_Or_Night <- ifelse( hour >= 7 & hour <=19, "Day", "Night") Day_Or_Night ``` ``` ## [1] "Day" "Night" "Night" "Night" "Night" "Day" "Day" "Day" "Day" "Night" "Day" "Night" "Night" "Day" ## [15] "Night" "Night" "Day" "Night" ``` --- class: sectionTitle, inverse # .green[Looping] ➿ --- # Looping There are times when it is very helpful if we can loop over a collection of things. These may be elements in a `vector`, rows in a `data.frame`, cycling through files on your computer to load in different data components, or to manipulate and analyze tweets harvested from the internet. -- **Use Case:** Determine which values are odd using the modulus operator `%%` .pull-left[ ```r x <- c(2, 4, 5, 7) x[1] %% 2 == 0 ``` ``` ## [1] TRUE ``` ```r x[2] %% 2 == 0 ``` ``` ## [1] TRUE ``` ```r x[3] %% 2 == 0 ``` ``` ## [1] FALSE ``` ```r x[4] %% 2 == 0 ``` ``` ## [1] FALSE ``` ] .pull-right[ Notice how the index used on `x` is simply a number that is incremented by `1` for each successive entry. > It sure would be nice if there were **some** method to do this automatically. ] --- # The `for` Loop The for loop is a function that iterates across items. These can be sequences such as `1:10` or as enumerated items in a vector. ```r for( i in 1:4) { cat(i, x[i] %% 2 == 0, "\n") } ``` ``` ## 1 TRUE ## 2 TRUE ## 3 FALSE ## 4 FALSE ``` -- .red[Notice: index variable `i` changes each time through the loop!] --- # The `for` Loop The for loop is a function that iterates across items. These can be sequences such as `1:10` or as enumerated items in a vector. ```r for( item in x) { cat(item, item %% 2 == 0, "\n") } ``` ``` ## 2 TRUE ## 4 TRUE ## 5 FALSE ## 7 FALSE ``` -- .red[Notice: the value passed is the actual item in the vector!] --- # Applying Functions to Values Sometimes it helps to apply a single function to each element in a vector. This is the basis for what we call `\(functional \; programming\)`, a kind of approach to working with objects that `R` supports. -- ```r is_even <- function( x ) { return(x %% 2 == 0) } lapply(x, is_even ) ``` ``` ## [[1]] ## [1] TRUE ## ## [[2]] ## [1] TRUE ## ## [[3]] ## [1] FALSE ## ## [[4]] ## [1] FALSE ``` --- # Simplifying the `lapply` Function The output from `lapply()` is a list, whose length is the same as that of the original data. However, it is sometimes necessary to just get back a vector of values for the results instead of futzing around with a list.<sup>1</sup>.footnote[<sup>1</sup>.small[`lapply(x,FUN,simplify=TRUE)`]] ```r sapply( x, is_even ) ``` ``` ## [1] TRUE TRUE FALSE FALSE ``` -- You can also define the function right in the call itself. Here is an example for finding odd values. ```r sapply( x, function( x ) { return( x %% 2 == 1) } ) ``` ``` ## [1] FALSE FALSE TRUE TRUE ``` --- # Mapping a Function There are some other ways to do the same kinds of operations that are a bit more verbose. .pull-left[ ![Mapping](https://live.staticflickr.com/65535/50953686326_50a2e2a5bd_o_d.png) ] -- .pull-right[ ```r square_it <- function( x ) { return( x*x ) } x2 <- Map( square_it, x) x2 ``` ``` ## [[1]] ## [1] 4 ## ## [[2]] ## [1] 16 ## ## [[3]] ## [1] 25 ## ## [[4]] ## [1] 49 ``` ```r unlist( x2 ) ``` ``` ## [1] 4 16 25 49 ``` ] --- # Using Filter (just like in tidyverse 👍) .pull-left[![Filtering Based Upon Logical](https://live.staticflickr.com/65535/50952984478_e51d070077_o_d.png)] .pull-right[ ```r x ``` ``` ## [1] 2 4 5 7 ``` ```r Filter(is_even, x ) ``` ``` ## [1] 2 4 ``` ] --- # Reduce This allows us to use some function to accumulate the values of elements along a vector. .pull-left[![Reduce operations](https://live.staticflickr.com/65535/50953791807_0685201446_o_d.png)] .pull-right[ ```r my_func <- function( x, y ) { return( x + y ) } Reduce(my_func, x, init=10 ) ``` ``` ## [1] 28 ``` Can check it like this: ```r sum( x ) + 10 ``` ``` ## [1] 28 ``` ] --- # A Non-Trivial Example Problem: Load all the Baja GEOTiff files in the data directory and then go through them and crop them to the area where *Araptus attenuatus* is found and save them locally as an R object. -- ### Process 1) Find the Data Directory 2) Iterate through the files. 3) Determine if each file is a GEOTiff 4) Process each GEOTiff --- # Download the Data Directory For this class (and the other ones that teach using R) I have a bunch of different data sets that we use. I've made an archive of them and put them on GoogleDrive (~57.3 MB in size). .pull-left[ - Download it from [here](https://drive.google.com/file/d/1jJUqPaho9TyltAYyGqlwG7G565tQqK5j/view?usp=sharing). - Open the folder where your R Project is located. - Put the director of data (appropriately called 'data' in that same folder) - When you access it, it will be at the *root* of your project directory. ] -- .pull-right[ Here is my Project Directory with the data file. ![The Data File in My Root Project Directory](https://live.staticflickr.com/65535/50953012413_3f5dea05bc_o_d.png) ] --- # Traversing the Filesystem So, assuming it is downloaded and in place, we can ask your operating system to give us a list of all the files in a particular folder using the `list.files()` function. At a minimum, you need to give it a starting path to begin the search. Your .green[Home] directory is .green[~] (`/Users/Rodney` **or** `C:\Users\Rodney`) ```r list.files("~") ``` ``` ## [1] "Applications" "Desktop" "Documents" "Downloads" "Library" "Movies" "Music" ## [8] "Pictures" "Public" ``` --- # Other Path Components The .green[Current] directory is .green[.] ```r list.files( ".") ``` ``` ## [1] "alt_22.tif" "Annual_Precip_22.tif" "homework.nb.html" "homework.Rmd" ## [5] "libs" "Maximum_Precip.tif" "Maximum_Temp_22.tif" "Mean_Temp_22.tif" ## [9] "Minimum_Precip_22.tif" "Minimum_Temp_22.tif" "narrative.nb.html" "narrative.Rmd" ## [13] "slides_files" "slides.html" "slides.Rmd" "summarize_levels.R" ## [17] "summarize_levels1.R" ``` The .green[Parent] directory to the .green[Current] one is .green[../] ```r list.files( "../" ) ``` ``` ## [1] "functions" "iteration" "Manipulations.key" "relational_data" "scraping_websites" ## [6] "tidyverse_1" "tidyverse_2" ``` ```r list.files( "../../" ) ``` ``` ## [1] "css" "exams" "index.html" "index.Rmd" "manipulation" "markdown" "models" ## [8] "r_language" "spatial" "visualization" ``` --- # Subdirectories The directory for this lecture is 3-levels below the root directory for this ```r list.files("../../../data/", recursive = TRUE) ``` ``` ## [1] "alt_22.tif" "alt.rda" "Annual_Precip_22.tif" "arapat.csv" ## [5] "Araptus_Disperal_Bias.csv" "Beer_Styles.csv" "Centerlines-shp.zip" "DistrictCodes.csv" ## [9] "EconomistData.csv" "Maximum_Precip.tif" "Maximum_Temp_22.tif" "Mean_Temp_22.tif" ## [13] "Minimum_Precip_22.tif" "Minimum_Temp_22.tif" "Roads.zip" "rps_schools.zip" ## [17] "VDOT_Main_Roads.zip" "virginia_county_fips.csv" "Zoning_Districts-shp.zip" ``` --- # Pattern Matching The `list.files` function can also support pattern matching. ```r list.files("../../../data/", recursive = TRUE, pattern = "*.tif") ``` ``` ## [1] "alt_22.tif" "Annual_Precip_22.tif" "Maximum_Precip.tif" "Maximum_Temp_22.tif" ## [5] "Mean_Temp_22.tif" "Minimum_Precip_22.tif" "Minimum_Temp_22.tif" ``` -- And providing the whole path to the file. ```r list.files("../../../data/", recursive = TRUE, pattern = "*.tif", full.names = TRUE) ``` ``` ## [1] "../../../data//alt_22.tif" "../../../data//Annual_Precip_22.tif" ## [3] "../../../data//Maximum_Precip.tif" "../../../data//Maximum_Temp_22.tif" ## [5] "../../../data//Mean_Temp_22.tif" "../../../data//Minimum_Precip_22.tif" ## [7] "../../../data//Minimum_Temp_22.tif" ``` --- # Back to the Problem .pull-left[ ### Process 1) Find the Data Directory 2) Iterate through the files. 3) Determine if each file is a GEOTiff 4) Process each GEOTiff - crop & save locally ] .pull-right[ ```r library( sf ) library( tidyverse ) read_csv("https://raw.githubusercontent.com/dyerlab/ENVS-Lectures/master/data/Araptus_Disperal_Bias.csv") %>% st_as_sf( coords=c("Longitude","Latitude"), crs=4326 ) %>% st_union() %>% st_buffer(dist = 1) %>% st_bbox() -> bounds bounds ``` ``` ## xmin ymin xmax ymax ## -115.29353 22.28550 -108.32700 30.32541 ``` ] --- # Processing Files Let's step through the process. ```r files <- list.files("../../../data/", recursive = TRUE, pattern = "*.tif", full.names = TRUE) files ``` ``` ## [1] "../../../data//alt_22.tif" "../../../data//Annual_Precip_22.tif" ## [3] "../../../data//Maximum_Precip.tif" "../../../data//Maximum_Temp_22.tif" ## [5] "../../../data//Mean_Temp_22.tif" "../../../data//Minimum_Precip_22.tif" ## [7] "../../../data//Minimum_Temp_22.tif" ``` Here is the basic code. ```r library( raster ) l <- list() for( file in files) { r <- raster( file ) r <- crop( r, bounds ) l[[basename(file)]] <- r writeRaster( r, filename = basename( file ), overwrite=TRUE ) } ``` --- # Did it work? We can see if it worked by looking in the current directory for all *.tif files ```r list.files(".", pattern="tif") ``` ``` ## [1] "alt_22.tif" "Annual_Precip_22.tif" "Maximum_Precip.tif" "Maximum_Temp_22.tif" ## [5] "Mean_Temp_22.tif" "Minimum_Precip_22.tif" "Minimum_Temp_22.tif" ``` -- .pull-left[ To delete a file you can use the `unlink()` function but . ```r for( file in files ) { unlink(basename(file)) } list.files(".", pattern="tif") ``` ``` ## character(0) ``` ] .pull-right[ .center[.red[CAUTION]] ![](https://live.staticflickr.com/65535/50954000847_804aeefabf_w_d.jpg) ] --- # Putting it All Together. We can put this all together as a function that loads all the GEOTiff files in a particular directory, crops them and saves them locally. ```r loadAndCropMyRasterMan <- function( directory, bounds ) { files <- list.files( path=directory, pattern = "*tif", full.names = TRUE ) for( file in files ) { r <- raster( file ) r <- crop( r, bounds ) writeRaster( r, filename = basename( file ), overwrite = TRUE ) cat("Saved", basename( file ), "\n") } } loadAndCropMyRasterMan( "../../../data", bounds) ``` ``` ## Saved alt_22.tif ## Saved Annual_Precip_22.tif ## Saved Maximum_Precip.tif ## Saved Maximum_Temp_22.tif ## Saved Mean_Temp_22.tif ## Saved Minimum_Precip_22.tif ## Saved Minimum_Temp_22.tif ``` --- class: sectionTitle, inverse # .green[Saving Functions] --- # Saving Functions - Within Markdown .pull-left[ Local Functions: - Used in a single file. - Place in its own *chunk*. - Available in that file *only*. ] -- .pull-right[ ```` ```{r echo=FALSE} best_class <- function( name = "ENVS543") { if( name == "ENVS543" ){ return( paste( name, "is my favorite class") ) } else { return( "Really?") } } ``` ```` ] --- # Saving Functions - Script Files .pull-left[ Broadly Applicable Functions: - Used in several files. - Place in its own *.R script file. - Save in a location you can access. ] -- .pull-right[ Here is an example function that summarizes levels from a `data.frame` ```r summarize_levels ``` ``` ## function( x, dataCol, groupCol, fun ) { ## ## vals <- by( x[[dataCol]], x[[groupCol]], fun) ## ## ret <- data.frame( names(vals) ) ## names( ret )[1] <- groupCol ## ## ret[[dataCol]] <- as.numeric( vals ) ## ## return( ret ) ## } ``` ] --- # Accessing Functions From Scripts 📜 To load in a function from an `*.R` file, you use the `source()` command and give it the path to the file .blueinline[*relative to your markdown*] file. ```r ls() ``` ``` ## [1] "bounds" "Day_Or_Night" "directory" "favorite" ## [5] "file" "files" "foo" "freezing" ## [9] "grade" "hour" "i" "is_even" ## [13] "item" "l" "loadAndCropMyRasterMan" "my_func" ## [17] "r" "square_it" "who_is_in_the_house" "x" ## [21] "x2" ``` -- After pulling it in, it is now available. ```r source("summarize_levels.R") ls() ``` ``` ## [1] "bounds" "Day_Or_Night" "directory" "favorite" ## [5] "file" "files" "foo" "freezing" ## [9] "grade" "hour" "i" "is_even" ## [13] "item" "l" "loadAndCropMyRasterMan" "my_func" ## [17] "r" "square_it" "summarize_levels" "who_is_in_the_house" ## [21] "x" "x2" ``` --- # Using a Function Just call it as a normal function. ```r summarize_levels( iris, dataCol = "Sepal.Length", groupCol = "Species", fun = mean ) %>% kable( caption="Table 1: Mean sepal length for three species of Iris." ) %>% kable_paper( bootstrap_options = "striped", full_width = FALSE ) ``` <table class=" lightable-paper" style='font-family: "Arial Narrow", arial, helvetica, sans-serif; width: auto !important; margin-left: auto; margin-right: auto;'> <caption>Table 1: Mean sepal length for three species of Iris.</caption> <thead> <tr> <th style="text-align:left;"> Species </th> <th style="text-align:right;"> Sepal.Length </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> setosa </td> <td style="text-align:right;"> 5.006 </td> </tr> <tr> <td style="text-align:left;"> versicolor </td> <td style="text-align:right;"> 5.936 </td> </tr> <tr> <td style="text-align:left;"> virginica </td> <td style="text-align:right;"> 6.588 </td> </tr> </tbody> </table> --- # Helpful Tips for Functions ![](https://live.staticflickr.com/65535/50398439551_f549a80586_c_d.jpg) --- # Storing Functions If you begin to create a lot of functions that you use all the time, you can put them together into a package and call them with `library()`, just like the other packages we've been using this whole semester. This is left as an advanced topic. --- class: middle background-image: url("images/contour.png") background-position: right background-size: auto .center[ # 🙋🏻♀️ Questions? ![Peter Sellers](https://live.staticflickr.com/65535/50382906427_2845eb1861_o_d.gif+) ] <p> </p> .bottom[ If you have any questions for about the content presented herein, please feel free to [submit them to me](mailto://rjdyer@vcu.edu) and I'll get back to you as soon as possible.]