Backward elimination is used to select predictors for the linear regression. We start with the full linear model containing all candidate predictors and remove the predictor with the highest p-value, provided that p-value exceeds the threshold of 0.2. We then refit the model and remove the next least significant predictor, repeating until all remaining p-values are below 0.2. The following tables show the first step, with all predictors, and the final step, with only the retained predictors.
Term | Estimate | Std. Error | t value | Pr(>\|t\|) |
---|---|---|---|---|
(Intercept) | -41.8101389 | 207.5949330 | -0.2014025 | 0.8405407 |
FFMC | 0.6894325 | 2.3396439 | 0.2946741 | 0.7684771 |
DMC | 0.1465956 | 0.1304246 | 1.1239872 | 0.2620512 |
DC | -0.0178292 | 0.0322502 | -0.5528398 | 0.5808463 |
ISI | -2.0528421 | 1.9503771 | -1.0525360 | 0.2935272 |
temp | 1.2795956 | 1.4165750 | 0.9033024 | 0.3671984 |
RH | -0.4515129 | 0.4543571 | -0.9937400 | 0.3212698 |
wind | 2.6729698 | 3.1698746 | 0.8432415 | 0.3998655 |
rain | -3.7434756 | 13.5730582 | -0.2758019 | 0.7829184 |

Term | Estimate | Std. Error | t value | Pr(>\|t\|) |
---|---|---|---|---|
(Intercept) | 36.7582234 | 18.5795824 | 1.978420 | 0.0489090 |
DMC | 0.1289139 | 0.0848741 | 1.518884 | 0.1299749 |
RH | -0.6161311 | 0.3477489 | -1.771770 | 0.0775733 |
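
The elimination procedure described above can be sketched in a few lines of R. This is a minimal illustration, not the code used to produce the tables: the helper name `backward_eliminate` is hypothetical, the demo data are simulated rather than taken from the fire dataset, and the sketch assumes every predictor is numeric so that each coefficient row maps to exactly one column.

```r
# Hypothetical helper: p-value-based backward elimination for lm().
# Assumes all predictors are numeric (one coefficient row per predictor).
backward_eliminate <- function(data, response, threshold = 0.2) {
  predictors <- setdiff(names(data), response)
  while (length(predictors) > 0) {
    # Refit with the predictors that are still in the model.
    fit <- lm(reformulate(predictors, response = response), data = data)
    coefs <- summary(fit)$coefficients
    slopes <- coefs[rownames(coefs) != "(Intercept)", , drop = FALSE]
    p_values <- setNames(as.vector(slopes[, "Pr(>|t|)"]), rownames(slopes))

    # Stop once the least significant predictor is within the threshold.
    worst <- names(p_values)[which.max(p_values)]
    if (p_values[[worst]] <= threshold) return(fit)
    predictors <- setdiff(predictors, worst)
  }
  # Every predictor was eliminated: fall back to the intercept-only model.
  lm(reformulate("1", response = response), data = data)
}

# Simulated demo data (NOT the fire dataset): y depends on x1 and x2 only.
set.seed(1)
n <- 200
demo <- data.frame(x1 = rnorm(n), x2 = rnorm(n), noise = rnorm(n))
demo$y <- 3 * demo$x1 - 2 * demo$x2 + rnorm(n)

final_fit <- backward_eliminate(demo, response = "y")
print(summary(final_fit)$coefficients)
```

Because the stopping rule only looks at the largest remaining p-value, a pure-noise column can still survive by chance about one time in five at a threshold of 0.2.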
The final predictors selected are DMC and RH (relative humidity), whose p-values are below our threshold of 0.2. Our hypothesized formula for the MLR model is: \[\text{Burned Area} = \beta_0 + \beta_1\,\text{DMC} + \beta_2\,\text{RH}\] The coefficient plot for DMC and RH is shown below:
The estimated regression coefficients for DMC and RH are 0.1289139 and -0.6161311, respectively, and the intercept is 36.7582234. The resulting regression formula is: \[\text{Burned Area} = 36.7582234 + 0.1289139\,\text{DMC} - 0.6161311\,\text{RH}\]
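
To make the fitted formula concrete, the short sketch below evaluates it for one pair of inputs. The coefficients are copied from the final table; the helper name `predict_area` and the input values (DMC = 100, RH = 40) are illustrative assumptions, not observations from the dataset.

```r
# Coefficients copied from the final fitted model reported above.
b0    <- 36.7582234
b_dmc <- 0.1289139
b_rh  <- -0.6161311

# Hypothetical helper: predicted burned area for given DMC and RH values.
predict_area <- function(dmc, rh) {
  b0 + b_dmc * dmc + b_rh * rh
}

# Illustrative inputs only (not taken from the data).
example_prediction <- predict_area(dmc = 100, rh = 40)
print(example_prediction)  # about 25.0
```

Read this way, each additional unit of DMC raises the predicted burned area by about 0.13, while each additional percentage point of relative humidity lowers it by about 0.62.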
To examine the adequacy of our model, we plotted its residuals. The residuals are not normally distributed, which violates the normal-error assumption behind the t-tests and p-values reported above and indicates that this linear model is not appropriate for the dataset.
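# Residuals
library(modelr)  # add_residuals()
library(plotly)  # plot_ly(), layout(), and the %>% pipe

# Attach the residuals of the final model as a `resid` column, then plot them.
add_residuals(fire_linear, lm(area ~ DMC + RH, data = fire_linear)) %>%
  plot_ly(
    y = ~resid, type = "violin"
  ) %>%
  layout(
    title = "Violin Plot of Residuals",
    xaxis = list(title = "Climate Variable"),
    yaxis = list(title = "Residual", range = c(-200, 300))
  )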
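
A violin plot gives a visual impression of the residual distribution. As a complementary check, the sketch below applies a normal Q-Q plot and the Shapiro-Wilk test from base R. It is only an illustration under stated assumptions: `fire_linear` is not available here, so the data frame `sim` is simulated with an arbitrary right-skewed response and arbitrary predictor ranges, and its numbers say nothing about the real fire data.

```r
# Simulated stand-in data (NOT the fire dataset); ranges are arbitrary.
set.seed(42)
n <- 300
sim <- data.frame(DMC = runif(n, 0, 300), RH = runif(n, 15, 100))
sim$area <- rexp(n, rate = 1 / 20)  # heavily right-skewed response

# Fit the same model form and extract its residuals.
sim_fit <- lm(area ~ DMC + RH, data = sim)
res <- residuals(sim_fit)

# Shapiro-Wilk test: a small p-value is evidence against normal residuals.
sw <- shapiro.test(res)
print(sw)

# Normal Q-Q plot: points far from the reference line indicate non-normality.
qqnorm(res)
qqline(res)
```

With the real data, replacing `sim` by `fire_linear` in the `lm()` call would apply the same checks to the model above.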