class: left, middle, inverse background-image: url("https://live.staticflickr.com/65535/50559539697_1c35d0a56a_o_d.png") background-size: cover # .black[Regression Models
]

### .yellow[.fancy[Linear Models & <br>Model Comparison]]

---

# Linear Models

$$
y = \beta_0 + \beta_1 x_1 + \epsilon
$$

.pull-left[
The basic linear model has:

- An intercept ( `\(\beta_0\)` ),
- A slope coefficient ( `\(\beta_1\)` ), and
- An error term ( `\(\epsilon\)` ).
]

.pull-right[
<img src="slides_files/figure-html/unnamed-chunk-1-1.png" width="504" style="display: block; margin: auto;" />
]

---

# Building A Model - Random Search

.pull-left[
```r
models <- data.frame( beta0 = runif(250,-20,40),
                      beta1 = runif(250, -5, 5))
summary( models )
```

```
##      beta0             beta1
##  Min.   :-19.840   Min.   :-4.97599
##  1st Qu.: -4.481   1st Qu.:-2.60468
##  Median : 12.735   Median :-0.01890
##  Mean   : 10.938   Mean   :-0.03569
##  3rd Qu.: 26.349   3rd Qu.: 2.28999
##  Max.   : 39.846   Max.   : 4.97482
```
]

--

.pull-right[
<img src="slides_files/figure-html/unnamed-chunk-3-1.png" width="504" style="display: block; margin: auto;" />
]

---

# Building A Model - Search Criterion

.center[
<img src="slides_files/figure-html/unnamed-chunk-4-1.png" width="504" style="display: block; margin: auto;" />
]

---

# Searching Random Model Space

```r
# Root mean squared distance between observed Y and the line's predictions
model_distance <- function( intercept, slope, X, Y ) {
  yhat <- intercept + slope * X
  diff <- Y - yhat
  return( sqrt( mean( diff ^ 2 ) ) )
}
```

--

```r
models$dist <- NA
for( i in 1:nrow(models) ) {
  models$dist[i] <- model_distance( models$beta0[i],
                                    models$beta1[i],
                                    df$x,
                                    df$y )
}
head( models )
```

```
##       beta0     beta1      dist
## 1 23.557176  4.315708 17.133718
## 2 25.605363 -3.244048 29.759245
## 3 11.118244 -4.251429 48.500099
## 4  4.939457  1.705851 18.361322
## 5  9.418283  3.570657  6.474642
## 6 36.613684 -2.834294 19.691697
```

---

# Top 10 Random Models

.pull-left[
<img src="slides_files/figure-html/unnamed-chunk-7-1.png" width="504" style="display: block; margin: auto;" />
]

.pull-right[
The 10 best models (selected by filtering inside the `data =` argument of `geom_abline()`), shown with the original points.
```r
ggplot() +
  geom_abline( aes(intercept = beta0,
                   slope = beta1,
                   color = -dist),
               data = filter( models, rank(dist) <= 10 ),
               alpha = 0.5) +
  geom_point( aes(x,y),
              data=df)
```
]

---

# The Best Coefficients

```r
ggplot( models, aes(x = beta0,
                    y = beta1,
                    color = -dist)) +
  geom_point( data = filter( models, rank(dist) <= 10),
              color = "red",
              size = 4) +
  geom_point()
```

<img src="slides_files/figure-html/unnamed-chunk-9-1.png" width="504" style="display: block; margin: auto;" />

---

# Systematic Grid Search

.pull-left[
<img src="slides_files/figure-html/unnamed-chunk-10-1.png" width="504" style="display: block; margin: auto;" />
]

.pull-right[
```r
grid <- expand.grid( beta0 = seq(15,20, length = 25),
                     beta1 = seq(2, 3, length = 25))
grid$dist <- NA
for( i in 1:nrow(grid) ) {
  grid$dist[i] <- model_distance( grid$beta0[i],
                                  grid$beta1[i],
                                  df$x,
                                  df$y )
}
ggplot( grid, aes(x = beta0,
                  y = beta1,
                  color = -dist)) +
  geom_point( data = filter( grid, rank(dist) <= 10),
              color = "red",
              size = 4) +
  geom_point()
```
]

---
class: inverse, sectionTitle

# .yellow[Our Friend `lm()`]

## .fancy[Linear Models]

---
class: center, middle

---

# Specifying a Formula

*Single Predictor Model*

```
y ~ x
```

*Multiple Additive Predictors*

```
y ~ x1 + x2
```

*Interaction Terms*

```
y ~ x1 + x2 + x1*x2
```

---

# Fitting A Model

```r
fit <- lm( y ~ x, data = df )
fit
```

```
##
## Call:
## lm(formula = y ~ x, data = df)
##
## Coefficients:
## (Intercept)            x
##      17.280        2.625
```

---

# Model Summaries

```r
summary( fit )
```

```
##
## Call:
## lm(formula = y ~ x, data = df)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -7.9836 -4.0182 -0.8709  5.3064  6.9909
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)   17.280      4.002   4.318  0.00255 **
## x              2.626      0.645   4.070  0.00358 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.'
0.1 ' ' 1
##
## Residual standard error: 5.859 on 8 degrees of freedom
## Multiple R-squared:  0.6744, Adjusted R-squared:  0.6337
## F-statistic: 16.57 on 1 and 8 DF,  p-value: 0.003581
```

---

# Components within the Summary Object

```r
names( summary( fit ) )
```

```
##  [1] "call"          "terms"         "residuals"     "coefficients"
##  [5] "aliased"       "sigma"         "df"            "r.squared"
##  [9] "adj.r.squared" "fstatistic"    "cov.unscaled"
```

--

The overall p-value is not one of these components. It can be recovered from the `fstatistic` component by asking the F-distribution for the upper-tail probability of the test statistic, given the degrees of freedom for both the model and the residuals.

```r
summary( fit )$fstatistic
```

```
##    value    numdf    dendf
## 16.56838  1.00000  8.00000
```

```r
get_pval <- function( model ) {
  f <- summary( model )$fstatistic[1]    # value of the F statistic
  df1 <- summary( model )$fstatistic[2]  # model degrees of freedom
  df2 <- summary( model )$fstatistic[3]  # residual degrees of freedom
  p <- as.numeric( 1.0 - pf( f, df1, df2 ) )  # upper-tail probability
  return( p )
}

get_pval( fit )
```

```
## [1] 0.0035813
```

---

# Model Diagnostics - Residuals

```r
plot( fit, which = 1 )
```

<img src="slides_files/figure-html/unnamed-chunk-17-1.png" width="504" style="display: block; margin: auto;" />

---

# Normality of the Residuals

```r
plot( fit, which = 2 )
```

<img src="slides_files/figure-html/unnamed-chunk-18-1.png" width="504" style="display: block; margin: auto;" />

---

# Leverage

```r
plot( fit, which=5 )
```

<img src="slides_files/figure-html/unnamed-chunk-19-1.png" width="504" style="display: block; margin: auto;" />

---

# Decomposition of Variance

The terms in the analysis of variance table are:

- Degrees of Freedom (*df*): `1` degree of freedom for the model (one predictor) and `N-2` for the residuals (here `10 - 2 = 8`).
- Sums of Squared Deviations:
    - `\(SS_{Total} = \sum_{i=1}^N (y_i - \bar{y})^2\)`
    - `\(SS_{Model} = \sum_{i=1}^N (\hat{y}_i - \bar{y})^2\)`, and
    - `\(SS_{Residual} = SS_{Total} - SS_{Model}\)`
- Mean Squares (Standardization of the Sums of Squares for the degrees of freedom)
    - `\(MS_{Model} = \frac{SS_{Model}}{df_{Model}}\)`
    - `\(MS_{Residual} = \frac{SS_{Residual}}{df_{Residual}}\)`
- The `\(F\)`-statistic is the ratio `\(MS_{Model} / MS_{Residual}\)`, which follows an `\(F\)` distribution under the null hypothesis.
- `Pr(>F)` is the probability associated with the value of the `\(F\)`-statistic and depends upon the degrees of freedom for the model and residuals.

---

# Decomposition of Variance

```r
anova( fit )
```

```
## Analysis of Variance Table
##
## Response: y
##           Df Sum Sq Mean Sq F value   Pr(>F)
## x          1 568.67  568.67  16.568 0.003581 **
## Residuals  8 274.58   34.32
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
```

---

# Variance Explained

```r
summary( fit )
```

```
##
## Call:
## lm(formula = y ~ x, data = df)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -7.9836 -4.0182 -0.8709  5.3064  6.9909
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)   17.280      4.002   4.318  0.00255 **
## x              2.626      0.645   4.070  0.00358 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 5.859 on 8 degrees of freedom
## Multiple R-squared:  0.6744, Adjusted R-squared:  0.6337
## F-statistic: 16.57 on 1 and 8 DF,  p-value: 0.003581
```

---

# Relationship Between `\(R^2\)` & `\(r\)`

How much of the variation is explained?

`$$R^2 = \frac{SS_{Model}}{SS_{Total}}$$`

--

```r
c( `Regression R^2` = summary( fit )$r.squared,
   `Squared Correlation` = as.numeric( cor.test( df$x, df$y )$estimate^2 ) )
```

```
##      Regression R^2 Squared Correlation
##           0.6743782           0.6743782
```

> For a single-predictor regression, the square of the Pearson correlation is equal to `\(R^2\)`.

---

## Helper Functions

.pull-left[
Grabbing the predicted values `\(\hat{y}\)` from the model.
```r
predict( fit ) -> yhat
yhat
```

```
##        1        2        3        4        5        6        7        8
## 19.90545 22.53091 25.15636 27.78182 30.40727 33.03273 35.65818 38.28364
##        9       10
## 40.90909 43.53455
```

```r
plot( yhat ~ df$x, type='l', bty="n", col="red" )
```
]

.pull-right[
<img src="slides_files/figure-html/unnamed-chunk-25-1.png" width="504" style="display: block; margin: auto;" />
]

---

# Helper Functions - Residuals

.pull-left[
The residuals are the distances between each observed value and its corresponding value on the fitted line.

```r
residuals( fit )
```

```
##          1          2          3          4          5          6          7
## -4.4054545  5.5690909 -2.8563636  4.5181818  0.6927273 -6.2327273  6.1418182
##          8          9         10
## -7.9836364  6.9909091 -2.4345455
```
]

.pull-right[
<img src="slides_files/figure-html/unnamed-chunk-27-1.png" width="504" style="display: block; margin: auto;" />
]

---
class: inverse, sectionTitle

# .yellow[Comparing Models]

---

# What Makes One Model Better

There are two parameters that we have already looked at that may help. These are:

- The `\(P\)`-value: Models with smaller probabilities could be considered more informative.
- The `\(R^2\)`: Models that explain more of the variation may be considered more informative.

--

Let's start by looking at some airquality data we have played with previously when working on [data.frame objects](https://dyerlab.github.io/ENVS-Lectures/r_language/data_frames/homework.nb.html).

```r
airquality %>%
  select( -Month, -Day ) -> df.air
summary( df.air )
```

```
##      Ozone           Solar.R           Wind             Temp
##  Min.   :  1.00   Min.   :  7.0   Min.   : 1.700   Min.   :56.00
##  1st Qu.: 18.00   1st Qu.:115.8   1st Qu.: 7.400   1st Qu.:72.00
##  Median : 31.50   Median :205.0   Median : 9.700   Median :79.00
##  Mean   : 42.13   Mean   :185.9   Mean   : 9.958   Mean   :77.88
##  3rd Qu.: 63.25   3rd Qu.:258.8   3rd Qu.:11.500   3rd Qu.:85.00
##  Max.   :168.00   Max.   :334.0   Max.   :20.700   Max.
:97.00
##  NA's   :37       NA's   :7
```

---

# Base Models - What Influences Ozone

```r
fit.solar <- lm( Ozone ~ Solar.R, data = df.air )
fit.temp <- lm( Ozone ~ Temp, data = df.air )
fit.wind <- lm( Ozone ~ Wind, data = df.air )
```

--

<table class=" lightable-classic-2" style='font-family: "Arial Narrow", "Source Sans Pro", sans-serif; margin-left: auto; margin-right: auto;'>
<caption>Model parameters predicting mean ozone in parts per billion measured in New York during the period of 1 May 1973 - 30 September 1973 as predicted by Temperature, Windspeed, and Solar Radiation.</caption>
<thead>
<tr> <th style="text-align:left;"> Model </th> <th style="text-align:left;"> R2 </th> <th style="text-align:left;"> P </th> </tr>
</thead>
<tbody>
<tr> <td style="text-align:left;"> Ozone ~ Solar </td> <td style="text-align:left;"> 0.121 </td> <td style="text-align:left;"> 1.79e-04 </td> </tr>
<tr> <td style="text-align:left;"> Ozone ~ Temp </td> <td style="text-align:left;"> 0.488 </td> <td style="text-align:left;"> 0.00e+00 </td> </tr>
<tr> <td style="text-align:left;"> Ozone ~ Wind </td> <td style="text-align:left;"> 0.362 </td> <td style="text-align:left;"> 9.27e-13 </td> </tr>
</tbody>
</table>

---

# More Complicated Models

Multiple Regression Model - Including more than one predictor.
`\(y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \epsilon\)`

```r
fit.temp.wind <- lm( Ozone ~ Temp + Wind, data = df.air )
fit.temp.solar <- lm( Ozone ~ Temp + Solar.R, data = df.air )
fit.wind.solar <- lm( Ozone ~ Wind + Solar.R, data = df.air )
```

--

<table class=" lightable-classic-2" style='font-family: "Arial Narrow", "Source Sans Pro", sans-serif; margin-left: auto; margin-right: auto;'>
<caption>Model parameters predicting mean ozone in parts per billion measured in New York during the period of 1 May 1973 - 30 September 1973.</caption>
<thead>
<tr> <th style="text-align:left;"> Model </th> <th style="text-align:left;"> R2 </th> <th style="text-align:left;"> P </th> </tr>
</thead>
<tbody>
<tr> <td style="text-align:left;"> Ozone ~ Solar </td> <td style="text-align:left;"> 0.121 </td> <td style="text-align:left;"> 1.79e-04 </td> </tr>
<tr> <td style="text-align:left;"> Ozone ~ Temp </td> <td style="text-align:left;"> 0.488 </td> <td style="text-align:left;"> 0.00e+00 </td> </tr>
<tr> <td style="text-align:left;"> Ozone ~ Wind </td> <td style="text-align:left;"> 0.362 </td> <td style="text-align:left;"> 9.27e-13 </td> </tr>
<tr> <td style="text-align:left;"> Ozone ~ Temp + Wind </td> <td style="text-align:left;"> 0.569 </td> <td style="text-align:left;"> 0.00e+00 </td> </tr>
<tr> <td style="text-align:left;"> Ozone ~ Temp + Solar </td> <td style="text-align:left;"> 0.510 </td> <td style="text-align:left;"> 0.00e+00 </td> </tr>
<tr> <td style="text-align:left;"> Ozone ~ Wind + Solar </td> <td style="text-align:left;"> 0.449 </td> <td style="text-align:left;"> 9.99e-15 </td> </tr>
</tbody>
</table>

---

# For Completeness

How about all the predictors?
`\(y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \epsilon\)` ```r fit.all <- lm( Ozone ~ Solar.R + Temp + Wind, data = df.air ) ``` -- <table class=" lightable-paper lightable-striped" style='font-family: "Arial Narrow", arial, helvetica, sans-serif; width: auto !important; margin-left: auto; margin-right: auto;'> <thead> <tr> <th style="text-align:left;"> Model </th> <th style="text-align:left;"> R2 </th> <th style="text-align:left;"> P </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> Ozone ~ Solar </td> <td style="text-align:left;"> <span style=" color: black !important;">1.21e-01</span> </td> <td style="text-align:left;"> <span style=" color: black !important;">1.79e-04</span> </td> </tr> <tr> <td style="text-align:left;"> Ozone ~ Temp </td> <td style="text-align:left;"> <span style=" color: black !important;">4.88e-01</span> </td> <td style="text-align:left;"> <span style=" color: red !important;">0.00e+00</span> </td> </tr> <tr> <td style="text-align:left;"> Ozone ~ Wind </td> <td style="text-align:left;"> <span style=" color: black !important;">3.62e-01</span> </td> <td style="text-align:left;"> <span style=" color: black !important;">9.27e-13</span> </td> </tr> <tr> <td style="text-align:left;"> Ozone ~ Temp + Wind </td> <td style="text-align:left;"> <span style=" color: black !important;">5.69e-01</span> </td> <td style="text-align:left;"> <span style=" color: red !important;">0.00e+00</span> </td> </tr> <tr> <td style="text-align:left;"> Ozone ~ Temp + Solar </td> <td style="text-align:left;"> <span style=" color: black !important;">5.10e-01</span> </td> <td style="text-align:left;"> <span style=" color: red !important;">0.00e+00</span> </td> </tr> <tr> <td style="text-align:left;"> Ozone ~ Wind + Solar </td> <td style="text-align:left;"> <span style=" color: black !important;">4.49e-01</span> </td> <td style="text-align:left;"> <span style=" color: black !important;">9.99e-15</span> </td> </tr> <tr> <td style="text-align:left;"> 
Ozone ~ Temp + Wind + Solar </td> <td style="text-align:left;"> <span style=" color: green !important;">6.06e-01</span> </td> <td style="text-align:left;"> <span style=" color: red !important;">0.00e+00</span> </td> </tr>
</tbody>
</table>

---

## `\(R^2\)` Inflation

Any variable added to a model will account for some *Sums of Squares* (even if it is a small amount). So, `adding variables may artificially inflate the Model Sums of Squares`, and with it `\(R^2\)`.

Example:

> What happens if I add just random data to the regression models? How does `\(R^2\)` change?

---

# Random Data Effects

.pull-left[
<table class=" lightable-paper lightable-striped" style='font-family: "Arial Narrow", arial, helvetica, sans-serif; width: auto !important; margin-left: auto; margin-right: auto;'>
<caption>Original Models</caption>
<thead>
<tr> <th style="text-align:left;"> Models </th> <th style="text-align:left;"> R2 </th> </tr>
</thead>
<tbody>
<tr> <td style="text-align:left;"> Ozone ~ Temp </td> <td style="text-align:left;"> 0.4877 </td> </tr>
<tr> <td style="text-align:left;"> Ozone ~ Wind </td> <td style="text-align:left;"> 0.3619 </td> </tr>
<tr> <td style="text-align:left;"> Ozone ~ Solar </td> <td style="text-align:left;"> 0.1213 </td> </tr>
<tr> <td style="text-align:left;"> Ozone ~ Temp + Wind </td> <td style="text-align:left;"> 0.5687 </td> </tr>
<tr> <td style="text-align:left;"> Ozone ~ Temp + Solar </td> <td style="text-align:left;"> 0.5103 </td> </tr>
<tr> <td style="text-align:left;"> Ozone ~ Wind + Solar </td> <td style="text-align:left;"> 0.4495 </td> </tr>
<tr> <td style="text-align:left;"> Ozone ~ Temp + Wind + Solar </td> <td style="text-align:left;"> 0.6059 </td> </tr>
</tbody>
</table>
]

--

.pull-right[
<table class=" lightable-paper lightable-striped" style='font-family: "Arial Narrow", arial, helvetica, sans-serif; width: auto !important; margin-left: auto; margin-right: auto;'>
<caption>Original Models + Random Variables</caption>
<thead>
<tr> <th style="text-align:left;">
Models </th> <th style="text-align:left;"> R2 </th> </tr>
</thead>
<tbody>
<tr> <td style="text-align:left;"> Ozone ~ Temp + Wind + Solar + 1 Random Variables </td> <td style="text-align:left;"> 0.6060 </td> </tr>
<tr> <td style="text-align:left;"> Ozone ~ Temp + Wind + Solar + 2 Random Variables </td> <td style="text-align:left;"> 0.6176 </td> </tr>
<tr> <td style="text-align:left;"> Ozone ~ Temp + Wind + Solar + 3 Random Variables </td> <td style="text-align:left;"> 0.6184 </td> </tr>
<tr> <td style="text-align:left;"> Ozone ~ Temp + Wind + Solar + 4 Random Variables </td> <td style="text-align:left;"> 0.6203 </td> </tr>
<tr> <td style="text-align:left;"> Ozone ~ Temp + Wind + Solar + 5 Random Variables </td> <td style="text-align:left;"> 0.6240 </td> </tr>
<tr> <td style="text-align:left;"> Ozone ~ Temp + Wind + Solar + 6 Random Variables </td> <td style="text-align:left;"> 0.6277 </td> </tr>
<tr> <td style="text-align:left;"> Ozone ~ Temp + Wind + Solar + 7 Random Variables </td> <td style="text-align:left;"> 0.6330 </td> </tr>
<tr> <td style="text-align:left;"> Ozone ~ Temp + Wind + Solar + 8 Random Variables </td> <td style="text-align:left;"> 0.6330 </td> </tr>
</tbody>
</table>
]

---

# Perfect - My Models RULE

#### I can just add **random** variables to my model and always get an .redinline[awesome] fit!

<p> </p>

.center[
<iframe src="https://giphy.com/embed/7ymcoEE72hEf6" width="480" height="225" frameBorder="0" class="giphy-embed" allowFullScreen></iframe>
]

<p> </p>

.orangeinline[Not so fast, Beavis.]

---

# Model Comparisons

The Akaike Information Criterion (AIC) is a measure that allows us to compare models while penalizing each added parameter.

`\(AIC = -2 \ln L + 2p\)`

Here `\(L\)` is the maximized likelihood of the model and `\(p\)` is the number of estimated parameters. The criterion is to prefer the model with the lowest AIC value.

--

## Comparisons

To compare, we evaluate the differences in AIC for alternative models.
`\(\delta AIC = AIC - min( AIC )\)` --- # AIC & ∂AIC .pull-left[ <table class=" lightable-paper lightable-striped" style='font-family: "Arial Narrow", arial, helvetica, sans-serif; width: auto !important; margin-left: auto; margin-right: auto;'> <thead> <tr> <th style="text-align:left;"> Models </th> <th style="text-align:right;"> R2 </th> <th style="text-align:right;"> AIC </th> <th style="text-align:right;"> deltaAIC </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> Ozone ~ Temp </td> <td style="text-align:right;"> 0.488 </td> <td style="text-align:right;"> 1067.706 </td> <td style="text-align:right;"> 68.989 </td> </tr> <tr> <td style="text-align:left;"> Ozone ~ Wind </td> <td style="text-align:right;"> 0.362 </td> <td style="text-align:right;"> 1093.187 </td> <td style="text-align:right;"> 94.470 </td> </tr> <tr> <td style="text-align:left;"> Ozone ~ Solar </td> <td style="text-align:right;"> 0.121 </td> <td style="text-align:right;"> 1083.714 </td> <td style="text-align:right;"> 84.997 </td> </tr> <tr> <td style="text-align:left;"> Ozone ~ Temp + Wind </td> <td style="text-align:right;"> 0.569 </td> <td style="text-align:right;"> 1049.741 </td> <td style="text-align:right;"> 51.024 </td> </tr> <tr> <td style="text-align:left;"> Ozone ~ Temp + Solar </td> <td style="text-align:right;"> 0.510 </td> <td style="text-align:right;"> 1020.820 </td> <td style="text-align:right;"> 22.103 </td> </tr> <tr> <td style="text-align:left;"> Ozone ~ Wind + Solar </td> <td style="text-align:right;"> 0.449 </td> <td style="text-align:right;"> 1033.816 </td> <td style="text-align:right;"> 35.098 </td> </tr> <tr> <td style="text-align:left;"> Ozone ~ Temp + Wind + Solar </td> <td style="text-align:right;"> 0.606 </td> <td style="text-align:right;"> 998.717 </td> <td style="text-align:right;"> 0.000 </td> </tr> </tbody> </table> ] -- .pull-right[ <table class=" lightable-paper lightable-striped" style='font-family: "Arial Narrow", arial, helvetica, sans-serif; width: 
auto !important; margin-left: auto; margin-right: auto;'> <thead> <tr> <th style="text-align:left;"> Models </th> <th style="text-align:right;"> R2 </th> <th style="text-align:right;"> AIC </th> <th style="text-align:right;"> deltaAIC </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> Ozone ~ Temp + Wind + Solar + 1 Random Variables </td> <td style="text-align:right;"> 0.606 </td> <td style="text-align:right;"> 1000.701 </td> <td style="text-align:right;"> 1.983 </td> </tr> <tr> <td style="text-align:left;"> Ozone ~ Temp + Wind + Solar + 2 Random Variables </td> <td style="text-align:right;"> 0.618 </td> <td style="text-align:right;"> 999.382 </td> <td style="text-align:right;"> 0.665 </td> </tr> <tr> <td style="text-align:left;"> Ozone ~ Temp + Wind + Solar + 3 Random Variables </td> <td style="text-align:right;"> 0.618 </td> <td style="text-align:right;"> 1001.151 </td> <td style="text-align:right;"> 2.434 </td> </tr> <tr> <td style="text-align:left;"> Ozone ~ Temp + Wind + Solar + 4 Random Variables </td> <td style="text-align:right;"> 0.620 </td> <td style="text-align:right;"> 1002.593 </td> <td style="text-align:right;"> 3.876 </td> </tr> <tr> <td style="text-align:left;"> Ozone ~ Temp + Wind + Solar + 5 Random Variables </td> <td style="text-align:right;"> 0.624 </td> <td style="text-align:right;"> 1003.503 </td> <td style="text-align:right;"> 4.785 </td> </tr> <tr> <td style="text-align:left;"> Ozone ~ Temp + Wind + Solar + 6 Random Variables </td> <td style="text-align:right;"> 0.628 </td> <td style="text-align:right;"> 1004.413 </td> <td style="text-align:right;"> 5.696 </td> </tr> <tr> <td style="text-align:left;"> Ozone ~ Temp + Wind + Solar + 7 Random Variables </td> <td style="text-align:right;"> 0.633 </td> <td style="text-align:right;"> 1004.822 </td> <td style="text-align:right;"> 6.105 </td> </tr> </tbody> </table> ] --- class: inverse, sectionTitle # .yellow[Stepwise Regression] --- # Fitting Several Features What if we have 10 
predictor variables and are interested in fitting the `best` model des

.pull-left[
### Ap
]

.pull-right[
]

---
class: middle
background-position: right
background-size: auto

.center[
# Questions?
]

<p> </p>

.bottom[
If you have any questions about the content presented herein, please feel free to [submit them to me](mailto://rjdyer@vcu.edu) and I'll get back to you as soon as possible.]
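
---

# Appendix - AIC Comparison Sketch

The AIC comparison described in the Model Comparisons slides can be sketched in a few lines of base R. This is a minimal sketch, not the code used to build the tables in these slides: the object names (`air`, `formulas`, `fits`, `tab`) are made up for illustration, and the data are reduced to complete cases so that every model is fit to the same rows. Because of that choice, the values need not match the tables shown earlier.

```r
# Illustrative sketch only; object names here are assumptions, not the
# names used to generate the tables in the slides.

# Keep complete cases so every model is fit to the same observations.
air <- na.omit( airquality[ , c("Ozone", "Solar.R", "Wind", "Temp") ] )

# A small set of nested candidate models, written as formula strings.
formulas <- c( "Ozone ~ Temp",
               "Ozone ~ Temp + Wind",
               "Ozone ~ Temp + Wind + Solar.R" )

# Fit each candidate with lm().
fits <- lapply( formulas, function( f ) lm( as.formula( f ), data = air ) )

# Collect R^2 and AIC, then express AIC relative to the best (lowest) value.
tab <- data.frame( Model = formulas,
                   R2    = sapply( fits, function( m ) summary( m )$r.squared ),
                   AIC   = sapply( fits, AIC ) )
tab$deltaAIC <- tab$AIC - min( tab$AIC )
tab
```

The row with `deltaAIC` equal to zero has the lowest AIC among the candidates; larger values indicate less support relative to that model.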