This homework focuses on how we can use tidyverse routines to answer the exact same set of questions we addressed in the previous homework. The operative verbs are the core dplyr ones, such as select, filter, mutate, arrange, group_by, and summarize.
These can be combined in various ways to draw inferences from the raw data.
Using built-in routines, this amounts to using a lot of logical indices and making intermediate data.frame objects. However, using tidyverse foundations, it becomes a much easier process.
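To make the contrast concrete, here is a minimal sketch that computes the same grouped mean both ways. The toy data frame and its column names (site, tempC) are invented for illustration and are not part of the Rice Rivers Center data.

```r
library(dplyr)

# Toy data, invented for illustration only (not the Rice Rivers Center data)
toy <- data.frame(
  site  = c("A", "A", "B", "B", "B"),
  tempC = c(4.1, 5.3, 6.0, NA, 7.2)
)

# Built-in routines: a logical index plus an intermediate data.frame
keep       <- !is.na(toy$tempC) & toy$tempC > 5
toy_warm   <- toy[keep, ]
base_means <- tapply(toy_warm$tempC, toy_warm$site, mean)

# tidyverse: the same steps chained together with pipes
tidy_means <- toy %>%
  filter(!is.na(tempC), tempC > 5) %>%
  group_by(site) %>%
  summarize(meanTemp = mean(tempC))

base_means
tidy_means
```

Both versions give the same per-site means. The piped version simply avoids naming the intermediate objects (keep and toy_warm).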
For these questions, we will use the data set from the Rice Rivers Center, which is loaded as raw data by the code below.
library( readr )
url <- "https://docs.google.com/spreadsheets/d/1Mk1YGH9LqjF7drJE-td1G_JkdADOU0eMlrP01WFBT8s/pub?gid=0&single=true&output=csv"
rice <- read_csv( url )
summary( rice )
DateTime RecordID PAR WindSpeed_mph
Length:8199 Min. :43816 Min. : 0.000 Min. : 0.000
Class :character 1st Qu.:45866 1st Qu.: 0.000 1st Qu.: 2.467
Mode :character Median :47915 Median : 0.046 Median : 4.090
Mean :47915 Mean : 241.984 Mean : 5.446
3rd Qu.:49964 3rd Qu.: 337.900 3rd Qu.: 7.292
Max. :52014 Max. :1957.000 Max. :30.650
WindDir AirTempF RelHumidity BP_HG
Min. : 0.00 Min. : 3.749 Min. :15.37 Min. :29.11
1st Qu.: 37.31 1st Qu.:31.545 1st Qu.:42.25 1st Qu.:29.87
Median :137.30 Median :37.440 Median :56.40 Median :30.01
Mean :146.20 Mean :38.795 Mean :58.37 Mean :30.02
3rd Qu.:249.95 3rd Qu.:46.410 3rd Qu.:76.59 3rd Qu.:30.21
Max. :360.00 Max. :74.870 Max. :93.00 Max. :30.58
Rain_in H2O_TempC SpCond_mScm Salinity_ppt
Min. :0.0000000 Min. :-0.140 Min. :0.0110 Min. :0.0000
1st Qu.:0.0000000 1st Qu.: 3.930 1st Qu.:0.1430 1st Qu.:0.0700
Median :0.0000000 Median : 5.450 Median :0.1650 Median :0.0800
Mean :0.0008412 Mean : 5.529 Mean :0.1611 Mean :0.0759
3rd Qu.:0.0000000 3rd Qu.: 7.410 3rd Qu.:0.1760 3rd Qu.:0.0800
Max. :0.3470000 Max. :13.300 Max. :0.2110 Max. :0.1000
NA's :1 NA's :1 NA's :1
PH PH_mv Turbidity_ntu Chla_ugl
Min. :6.43 Min. :-113.8 Min. : 6.20 Min. : 1.3
1st Qu.:7.50 1st Qu.: -47.8 1st Qu.: 15.50 1st Qu.: 3.7
Median :7.58 Median : -43.8 Median : 21.80 Median : 6.7
Mean :7.60 Mean : -44.5 Mean : 24.54 Mean :137.3
3rd Qu.:7.69 3rd Qu.: -38.9 3rd Qu.: 30.30 3rd Qu.:302.6
Max. :9.00 Max. : 28.5 Max. :187.70 Max. :330.1
NA's :1 NA's :1 NA's :1 NA's :1
BGAPC_CML BGAPC_rfu ODO_sat ODO_mgl
Min. : 188 Min. : 0.10 Min. : 87.5 Min. :10.34
1st Qu.: 971 1st Qu.: 0.50 1st Qu.: 99.2 1st Qu.:12.34
Median : 1369 Median : 0.70 Median :101.8 Median :12.88
Mean :153571 Mean : 72.91 Mean :102.0 Mean :12.88
3rd Qu.:345211 3rd Qu.:163.60 3rd Qu.:104.1 3rd Qu.:13.34
Max. :345471 Max. :163.70 Max. :120.8 Max. :14.99
NA's :1 NA's :1 NA's :1 NA's :1
Depth_ft Depth_m SurfaceWaterElev_m_levelNad83m
Min. :12.15 Min. :3.705 Min. :-32.53
1st Qu.:14.60 1st Qu.:4.451 1st Qu.:-31.78
Median :15.37 Median :4.684 Median :-31.55
Mean :15.34 Mean :4.677 Mean :-31.55
3rd Qu.:16.12 3rd Qu.:4.913 3rd Qu.:-31.32
Max. :17.89 Max. :5.454 Max. :-30.78
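The summary above reports DateTime as class character, so any question about weekdays, months, or time of day first needs a parsed date-time column. The sketch below shows one way this might be done with lubridate. It uses made-up rows, and the "month/day/year hour:minute:second AM/PM" text layout and the UTC time zone are assumptions; check a few raw values of rice$DateTime and adjust the parser if they differ.

```r
library(dplyr)
library(lubridate)

# Made-up rows for illustration; the text layout of DateTime is an assumption
toy <- data.frame(
  DateTime = c("2/14/2004 12:15:00 AM", "2/14/2004 1:30:00 PM", "2/16/2004 6:45:00 AM"),
  Rain_in  = c(0, 0.01, 0)
)

toy_dated <- toy %>%
  mutate(
    Date    = mdy_hms(DateTime, tz = "UTC"),   # character -> date-time (assumed zone)
    Month   = month(Date, label = TRUE),       # ordered factor of month names
    Weekday = wday(Date, label = TRUE),        # ordered factor of weekday names
    Hour    = hour(Date)                       # 0-23, handy for day vs. night splits
  )

toy_dated
```

Once columns like these exist, they can be passed to group_by() and filter() in the same way as any other variable.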
We will be answering the same set of questions as before. This time, however, you should use tidyverse approaches (pipes and such) to find the answers. As before, provide your answers as text (using complete sentences) and include visual output in tabular or graphical form to support your assertions. The key point is that you need to develop an evidence-based narrative to address these questions.
On average, is there more rain on Mondays, at daytime, or at night?
What is the overall trend in salinity and pH? Does this pattern hold when considering each month individually?
Turbidity is a measurement of the opaqueness of water. In the rice data, we also have a measure of Chlorophyll A in the water. For observations where Chlorophyll A exceeds 200 \(\mu g \cdot l^{-1}\), describe the relationship between these two variables.
Show the pattern of tides during the work week that includes Valentine's Day in 2004.
Summarize estimates of wind direction for February. Pay close attention to what this variable is actually measuring and how you want to display its underlying patterns.