adapted from explainxkcd.com
Loss = f(Error)

| Purpose    | Description                                      |
|------------|--------------------------------------------------|
| Fitting    | Find parameters that minimize the loss function. |
| Evaluation | Calculate the loss function for the fitted model. |
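The two rows of the table can be illustrated with a minimal base-R sketch (toy data invented for illustration, not the baselers dataset): fitting means searching for the parameter value that minimizes a loss, evaluation means computing the loss at that value.

```r
# Toy data (hypothetical, for illustration only)
y <- c(2, 4, 6, 8)

# Squared-error loss as a function of a single parameter b0
loss <- function(b0) sum((y - b0)^2)

# Fitting: find the b0 that minimizes the loss
fit <- optimize(loss, interval = c(-100, 100))
fit$minimum        # close to mean(y) = 5

# Evaluation: calculate the loss for the fitted parameter
loss(fit$minimum)
```

For squared-error loss the minimizer is the mean, so `optimize()` lands (numerically) on `mean(y)`.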
In regression, the criterion Y is modeled as the weighted sum of the predictors:

Ŷ = β0 + β1×X1 + β2×X2 + ...

The weight βi indicates the change in Ŷ associated with a one-unit increase in Xi. Ceteris paribus, the larger |βi|, the stronger the influence of Xi on Ŷ. If βi = 0, then Xi has no influence on Ŷ.
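The weighted sum can be computed directly. A short sketch with hypothetical coefficients and predictor values (not the fitted baselers model):

```r
# Hypothetical intercept b0 and weights b1, b2
b <- c(b0 = 10, b1 = 2, b2 = -3)

# One observation's predictor values X1, X2
x <- c(X1 = 4, X2 = 1)

# Yhat = b0 + b1*X1 + b2*X2
y_hat <- b["b0"] + b["b1"] * x["X1"] + b["b2"] * x["X2"]
unname(y_hat)   # 10 + 2*4 - 3*1 = 15
```

Setting a weight to 0 drops its predictor's term from the sum, which is why βi = 0 means Xi has no influence.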
Ŷ = Logistic(β0 + β1×X1 + ...)

Logistic(x) = 1 / (1 + exp(−x))
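The logistic function is one line of R. A quick sketch showing that it squashes any real number into (0, 1), with Logistic(0) = 0.5:

```r
# Logistic (inverse-logit) function: maps any real x into (0, 1)
logistic <- function(x) 1 / (1 + exp(-x))

logistic(0)    # 0.5
logistic(4)    # close to 1
logistic(-4)   # close to 0

# Equivalent to base R's plogis()
all.equal(logistic(2), plogis(2))
```

This is why a logistic regression's output can be read as a predicted probability.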
LogLoss = −(1/n) ∑ᵢⁿ ( log(ŷ)·y + log(1 − ŷ)·(1 − y) )
Loss01 = 1 − Accuracy = (1/n) ∑ᵢⁿ I(y ≠ ⌊ŷ⌉)
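Both losses can be computed by hand on toy vectors (hypothetical labels and predicted probabilities) to see exactly what the formulas do:

```r
# Toy true labels (0/1) and predicted probabilities (hypothetical)
y     <- c(1, 1, 0, 0)
y_hat <- c(0.9, 0.6, 0.2, 0.4)

# Log loss: -(1/n) * sum( log(y_hat)*y + log(1 - y_hat)*(1 - y) )
log_loss <- -mean(log(y_hat) * y + log(1 - y_hat) * (1 - y))
log_loss   # about 0.338

# 0-1 loss: proportion of misclassifications after rounding y_hat
loss_01 <- mean(y != round(y_hat))
loss_01    # 0: every prediction rounds to the correct class
```

Note the difference: the 0-1 loss only cares whether each rounded prediction is right, while log loss also rewards confident correct probabilities (0.9 contributes less loss than 0.6).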
# set up recipe for regression model
lm_recipe <- recipe(income ~ ., data = baselers) %>%
  step_dummy(all_nominal_predictors())
lm_recipe

Data Recipe

Inputs:

      role #variables
   outcome          1
 predictor         19

Operations:

Dummy variables from all_nominal_predictors()
# set up recipe for logistic regression
# model
logistic_recipe <- recipe(eyecor ~ ., data = baselers) %>%
  step_dummy(all_nominal_predictors())
logistic_recipe

Data Recipe

Inputs:

      role #variables
   outcome          1
 predictor         19

Operations:

Dummy variables from all_nominal_predictors()
# set up model for regression model
lm_model <- linear_reg() %>%
  set_engine("lm") %>%
  set_mode("regression")
lm_model

Linear Regression Model Specification (regression)

Computational engine: lm
# set up model for logistic regression
# model
logistic_model <- logistic_reg() %>%
  set_engine("glm") %>%
  set_mode("classification")
logistic_model

Logistic Regression Model Specification (classification)

Computational engine: glm
# set up workflow for regression model
lm_workflow <- workflow() %>%
  add_recipe(lm_recipe) %>%
  add_model(lm_model)
lm_workflow

══ Workflow ══════════════════════════════════════════════
Preprocessor: Recipe
Model: linear_reg()

── Preprocessor ──────────────────────────────────────────
1 Recipe Step

• step_dummy()

── Model ─────────────────────────────────────────────────
Linear Regression Model Specification (regression)

Computational engine: lm
# set up workflow for logistic regression
# model
logistic_workflow <- workflow() %>%
  add_recipe(logistic_recipe) %>%
  add_model(logistic_model)
logistic_workflow

══ Workflow ══════════════════════════════════════════════
Preprocessor: Recipe
Model: logistic_reg()

── Preprocessor ──────────────────────────────────────────
1 Recipe Step

• step_dummy()

── Model ─────────────────────────────────────────────────
Logistic Regression Model Specification (classification)

Computational engine: glm
# fit the workflow
income_lm <- fit(lm_workflow, data = baselers)
tidy(income_lm)

# A tibble: 25 × 5
   term          estimate std.error statistic   p.value
   <chr>            <dbl>     <dbl>     <dbl>     <dbl>
 1 (Intercept) -192.        631.     -0.304   7.61e-  1
 2 id             0.000895    0.113   0.00792 9.94e-  1
 3 age          115.          2.88   40.1     2.23e-208
 4 height         4.95        3.02    1.64    1.02e-  1
 5 weight         1.01        3.27    0.307   7.59e-  1
 6 children     -48.9        31.9    -1.54    1.25e-  1
 7 happiness   -156.         31.1    -5.02    6.00e-  7
 8 fitness        6.94       17.9     0.389   6.97e-  1
 9 food           2.50        0.142  17.6     2.33e- 60
10 alcohol       26.1         2.47   10.6     8.05e- 25
# … with 15 more rows
# fit the logistic regression workflow
eyecor_glm <- fit(logistic_workflow, data = baselers)
tidy(eyecor_glm)

# A tibble: 25 × 5
   term          estimate std.error statistic p.value
   <chr>            <dbl>     <dbl>     <dbl>   <dbl>
 1 (Intercept) -3.04      1.32         -2.31   0.0211
 2 id           0.0000834 0.000236      0.354  0.723
 3 age          0.00734   0.00973       0.755  0.451
 4 height       0.00572   0.00630       0.907  0.364
 5 weight       0.00446   0.00678       0.658  0.510
 6 income      -0.0000395 0.0000666    -0.593  0.553
 7 children     0.0329    0.0665        0.495  0.621
 8 happiness    0.0386    0.0653        0.591  0.554
 9 fitness     -0.0419    0.0372       -1.13   0.261
10 food        -0.0000755 0.000339     -0.222  0.824
# … with 15 more rows
# generate predictions
lm_pred <- income_lm %>%
  predict(baselers) %>%
  bind_cols(baselers %>% select(income))
metrics(lm_pred, truth = income, estimate = .pred)

# A tibble: 3 × 3
  .metric .estimator .estimate
  <chr>   <chr>          <dbl>
1 rmse    standard    1008.
2 rsq     standard       0.868
3 mae     standard     792.
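For vectors of truths and estimates, rmse and mae each reduce to one line of base R. A sketch of what these metrics compute (toy numbers, not the baselers incomes; yardstick's own implementation handles grouping and missing values on top of this):

```r
# Toy truth/estimate pairs (hypothetical)
truth    <- c(100, 200, 300)
estimate <- c(110, 190, 330)

# Root mean squared error: square root of the mean squared residual
rmse <- sqrt(mean((truth - estimate)^2))

# Mean absolute error: mean absolute residual
mae <- mean(abs(truth - estimate))

c(rmse = rmse, mae = mae)
```

Because rmse squares the residuals before averaging, it penalizes the single large miss (30) more heavily than mae does, which is why rmse ≥ mae here and in the model output above.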
# generate predictions logistic regression
logistic_pred <- predict(eyecor_glm, baselers, type = "prob") %>%
  bind_cols(predict(eyecor_glm, baselers)) %>%
  bind_cols(baselers %>% select(eyecor))
metrics(logistic_pred, truth = eyecor, estimate = .pred_class, .pred_yes)

# A tibble: 4 × 3
  .metric     .estimator .estimate
  <chr>       <chr>          <dbl>
1 accuracy    binary        0.647
2 kap         binary        0.0566
3 mn_log_loss binary        0.634
4 roc_auc     binary        0.605
# ROC curve for logistic model
logistic_pred %>%
  roc_curve(truth = eyecor, .pred_yes) %>%
  autoplot()