This homework focuses on how we can use tidyverse routines to become more effective in our pre-treatment of data. As a reminder, the operative verbs we use in data preparation include:
These can be combined in various ways to draw inferences from the raw data.
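As a minimal sketch of how such verbs chain together, the example below assumes the verbs in question are the core dplyr ones (`filter()`, `mutate()`, `select()`, `group_by()`, `summarize()`, `arrange()`). The `toy` table and its column names are hypothetical and are not part of the homework data.

```r
library(dplyr)

# Hypothetical toy data; names and values are for illustration only.
toy <- tibble(
  site    = c("A", "A", "B", "B", "B"),
  temp_f  = c(32.0, 50.0, 41.0, 68.0, NA),
  depth_m = c(4.1, 4.5, 4.8, 5.0, 4.7)
)

result <- toy %>%
  filter(!is.na(temp_f)) %>%                  # drop rows with a missing temperature
  mutate(temp_c = (temp_f - 32) * 5 / 9) %>%  # derive Celsius from Fahrenheit
  select(site, temp_c, depth_m) %>%           # keep only the columns needed
  group_by(site) %>%                          # partition the rows by site
  summarize(
    mean_temp_c  = mean(temp_c),
    mean_depth_m = mean(depth_m),
    n            = n()
  ) %>%
  arrange(desc(mean_temp_c))                  # warmest site first

print(result)
```

Each step takes the table produced by the previous one, so the pipeline reads top to bottom as a description of the preparation.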
## The Data
For these questions, we will use the data set from the Rice Rivers Center, which is loaded as raw data by the code below.
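
    # Load the packages used throughout this homework
    library( readr )
    library( tidyverse )
    library( lubridate )

    # Published CSV export of the Rice Rivers Center data
    url <- "https://docs.google.com/spreadsheets/d/1Mk1YGH9LqjF7drJE-td1G_JkdADOU0eMlrP01WFBT8s/pub?gid=0&single=true&output=csv"
    rice <- read_csv( url )   # read the raw data into a tibble
    summary( rice )           # per-column summary of the raw data
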

    DateTime           RecordID        PAR                WindSpeed_mph    WindDir
    Length:8199        Min.   :43816   Min.   :   0.000   Min.   : 0.000   Min.   :  0.00
    Class :character   1st Qu.:45866   1st Qu.:   0.000   1st Qu.: 2.467   1st Qu.: 37.31
    Mode  :character   Median :47915   Median :   0.046   Median : 4.090   Median :137.30
                       Mean   :47915   Mean   : 241.984   Mean   : 5.446   Mean   :146.20
                       3rd Qu.:49964   3rd Qu.: 337.900   3rd Qu.: 7.292   3rd Qu.:249.95
                       Max.   :52014   Max.   :1957.000   Max.   :30.650   Max.   :360.00

    AirTempF         RelHumidity     BP_HG           Rain_in             H2O_TempC
    Min.   : 3.749   Min.   :15.37   Min.   :29.11   Min.   :0.0000000   Min.   :-0.140
    1st Qu.:31.545   1st Qu.:42.25   1st Qu.:29.87   1st Qu.:0.0000000   1st Qu.: 3.930
    Median :37.440   Median :56.40   Median :30.01   Median :0.0000000   Median : 5.450
    Mean   :38.795   Mean   :58.37   Mean   :30.02   Mean   :0.0008412   Mean   : 5.529
    3rd Qu.:46.410   3rd Qu.:76.59   3rd Qu.:30.21   3rd Qu.:0.0000000   3rd Qu.: 7.410
    Max.   :74.870   Max.   :93.00   Max.   :30.58   Max.   :0.3470000   Max.   :13.300
                                                                         NA's   :1

    SpCond_mScm      Salinity_ppt     PH             PH_mv            Turbidity_ntu
    Min.   :0.0110   Min.   :0.0000   Min.   :6.43   Min.   :-113.8   Min.   :  6.20
    1st Qu.:0.1430   1st Qu.:0.0700   1st Qu.:7.50   1st Qu.: -47.8   1st Qu.: 15.50
    Median :0.1650   Median :0.0800   Median :7.58   Median : -43.8   Median : 21.80
    Mean   :0.1611   Mean   :0.0759   Mean   :7.60   Mean   : -44.5   Mean   : 24.54
    3rd Qu.:0.1760   3rd Qu.:0.0800   3rd Qu.:7.69   3rd Qu.: -38.9   3rd Qu.: 30.30
    Max.   :0.2110   Max.   :0.1000   Max.   :9.00   Max.   :  28.5   Max.   :187.70
    NA's   :1        NA's   :1        NA's   :1      NA's   :1        NA's   :1

    Chla_ugl        BGAPC_CML        BGAPC_rfu        ODO_sat         ODO_mgl
    Min.   :  1.3   Min.   :   188   Min.   :  0.10   Min.   : 87.5   Min.   :10.34
    1st Qu.:  3.7   1st Qu.:   971   1st Qu.:  0.50   1st Qu.: 99.2   1st Qu.:12.34
    Median :  6.7   Median :  1369   Median :  0.70   Median :101.8   Median :12.88
    Mean   :137.3   Mean   :153571   Mean   : 72.91   Mean   :102.0   Mean   :12.88
    3rd Qu.:302.6   3rd Qu.:345211   3rd Qu.:163.60   3rd Qu.:104.1   3rd Qu.:13.34
    Max.   :330.1   Max.   :345471   Max.   :163.70   Max.   :120.8   Max.   :14.99
    NA's   :1       NA's   :1        NA's   :1        NA's   :1       NA's   :1

    Depth_ft        Depth_m         SurfaceWaterElev_m_levelNad83m
    Min.   :12.15   Min.   :3.705   Min.   :-32.53
    1st Qu.:14.60   1st Qu.:4.451   1st Qu.:-31.78
    Median :15.37   Median :4.684   Median :-31.55
    Mean   :15.34   Mean   :4.677   Mean   :-31.55
    3rd Qu.:16.12   3rd Qu.:4.913   3rd Qu.:-31.32
    Max.   :17.89   Max.   :5.454   Max.   :-30.78

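The summary reports `DateTime` as class `character`, so any question about time will likely need it converted to a real date-time first. The sketch below shows one way to do that with lubridate inside a `mutate()` call. The sample rows are hypothetical, and the timestamp layout (month/day/year with a 24-hour clock) is an assumption that should be checked against the actual file before reusing the parser.

```r
library(dplyr)
library(lubridate)

# Hypothetical sample rows: the timestamp layout is assumed for illustration
# and is not taken from the real file.
sample_rows <- tibble(
  DateTime = c("1/1/2014 00:15:00", "2/14/2014 13:30:00"),
  AirTempF = c(32.0, 50.0)
)

parsed <- sample_rows %>%
  mutate(
    Date     = mdy_hms(DateTime),        # character -> POSIXct (UTC by default)
    Month    = month(Date),              # numeric month, 1-12
    Hour     = hour(Date),               # hour of day, 0-23
    AirTempC = (AirTempF - 32) * 5 / 9   # Fahrenheit -> Celsius
  ) %>%
  select(Date, Month, Hour, AirTempC)

print(parsed)
```

Once the column is a proper date-time, derived fields such as month or hour can serve as grouping variables in later summaries.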
Provide your answers as text, written in complete sentences, and include visual output in tabular or graphical form to support your assertions. The key point is that you need to develop an evidence-based narrative to address these questions.
In all of the following questions, use the operative verbs together with the pipe operator `%>%` to extract the required output from the data.