Backward Elimination

Backward elimination is used to select candidate predictors for the linear regression. We start with the full linear model containing all predictor variables and remove the predictor with the highest p-value, provided that it exceeds the threshold of 0.2. We then refit the model and remove the next least significant predictor, repeating until all remaining p-values are smaller than 0.2. The following tables show the first step, with all predictors, and the final step, with the reduced set of predictors.

Linear Model with All Predictors

Term            Estimate   Std. Error     t value   Pr(>|t|)
(Intercept)  -41.8101389  207.5949330  -0.2014025  0.8405407
FFMC           0.6894325    2.3396439   0.2946741  0.7684771
DMC            0.1465956    0.1304246   1.1239872  0.2620512
DC            -0.0178292    0.0322502  -0.5528398  0.5808463
ISI           -2.0528421    1.9503771  -1.0525360  0.2935272
temp           1.2795956    1.4165750   0.9033024  0.3671984
RH            -0.4515129    0.4543571  -0.9937400  0.3212698
wind           2.6729698    3.1698746   0.8432415  0.3998655
rain          -3.7434756   13.5730582  -0.2758019  0.7829184

Reduced Linear Model with DMC and RH as Predictors

Term           Estimate  Std. Error    t value   Pr(>|t|)
(Intercept)  36.7582234  18.5795824   1.978420  0.0489090
DMC           0.1289139   0.0848741   1.518884  0.1299749
RH           -0.6161311   0.3477489  -1.771770  0.0775733
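
The elimination procedure described above can be sketched in R roughly as follows. This is a minimal illustration, not the code used to produce the tables: the helper name `backward_eliminate` is hypothetical, the built-in `mtcars` data stands in for the fire data, and the sketch assumes every column other than the response is a numeric predictor.

```r
# Hypothetical helper, not the authors' code: p-value-based backward elimination.
# Assumes every column other than `response` is a numeric predictor.
backward_eliminate <- function(data, response, threshold = 0.2) {
  predictors <- setdiff(names(data), response)
  while (length(predictors) > 0) {
    # Refit the model with the predictors that are still in play
    fit <- lm(reformulate(predictors, response = response), data = data)
    # Coefficient table without the intercept row
    coefs <- summary(fit)$coefficients[-1, , drop = FALSE]
    pvals <- setNames(coefs[, "Pr(>|t|)"], rownames(coefs))
    # Stop once every remaining predictor is below the threshold
    if (max(pvals) < threshold) return(fit)
    # Otherwise drop the least significant predictor and refit
    predictors <- setdiff(predictors, names(which.max(pvals)))
  }
  # No predictor survived: fall back to the intercept-only model
  lm(reformulate("1", response = response), data = data)
}

# Stand-in data (built-in mtcars), used only to show the call
fit <- backward_eliminate(mtcars[, c("mpg", "wt", "hp", "qsec", "drat")], "mpg")
summary(fit)$coefficients
```

With the fire data, the analogous call would pass the data frame holding the response and the eight candidate predictors. Removing one predictor per refit matters because p-values can change substantially once a correlated predictor leaves the model.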

Multiple Linear Regression

The final predictors selected are DMC and RH (relative humidity), whose p-values are both smaller than our threshold of 0.2. Our hypothesized formula for the MLR model is: \[\text{Burned Area} = \beta_0 + \beta_1\,\text{DMC} + \beta_2\,\text{RH}\] The coefficient plot for DMC and RH is shown below:

The regression coefficients for DMC and RH are estimated as 0.1289139 and -0.6161311, respectively, and the intercept as 36.7582234. The resulting regression formula is: \[\text{Burned Area} = 36.7582234 + 0.1289139\,\text{DMC} - 0.6161311\,\text{RH}\]
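
As a worked illustration of how the fitted formula is used, take hypothetical inputs DMC = 100 and RH = 40. These values are chosen only for the arithmetic and are not taken from the dataset:

\[\text{Burned Area} = 36.7582234 + 0.1289139 \times 100 - 0.6161311 \times 40 = 36.7582234 + 12.89139 - 24.645244 \approx 25.00\]

Holding RH fixed, each additional unit of DMC raises the predicted burned area by about 0.13; holding DMC fixed, each additional percentage point of RH lowers it by about 0.62.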

Diagnostics

To assess the adequacy of our model, we plotted its residuals. The residuals are not normally distributed, which violates the normality assumption of linear regression and suggests that this linear model is not appropriate for this dataset.

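# Residuals
library(modelr)  # add_residuals()
library(plotly)  # plot_ly(), layout(), and the %>% pipe

# Fit the reduced model and attach its residuals as a `resid` column
fire_fit <- lm(area ~ DMC + RH, data = fire_linear)
add_residuals(fire_linear, fire_fit) %>%
  plot_ly(
    y = ~resid, type = "violin"
  ) %>%
  layout(
    title = "Violin Plot of Residuals",
    xaxis = list(title = "Climate Variable"),
    yaxis = list(title = "Residual", range = c(-200, 300))
  )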
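
A violin plot shows the overall shape of the residuals. Normality can be checked more directly with a Q-Q plot and a Shapiro-Wilk test, as in the sketch below. It uses base R only and is not part of the original analysis. The built-in `mtcars` data stands in for the fire data here; in this report the model would be `lm(area ~ DMC + RH, data = fire_linear)`.

```r
# Stand-in model on built-in data, used only to keep the example self-contained
fit <- lm(mpg ~ wt + hp, data = mtcars)
res <- residuals(fit)

# Q-Q plot: points far from the reference line suggest non-normal residuals
qqnorm(res)
qqline(res)

# Shapiro-Wilk test: a small p-value is evidence against normality
sw <- shapiro.test(res)
sw$p.value
```

A strongly right-skewed response, which is what the asymmetric residual range in the plot above hints at, typically shows up as an upward curve in the right tail of the Q-Q plot.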