Challenges Set 7

Instructions

Use the starwars dataset from the dplyr package (already loaded with tidyverse) to complete the challenges below. I highly recommend that you first get to know the starwars dataset by trying the “Get to know your data” functions covered at the beginning of the Week 2 Starter file. Good luck and have fun completing them.

Challenge 1:

Write the code to complete the following:

  • Create a correlation matrix named “star_wars_corr_matrix” using all the numeric columns in the starwars dataset.

  • Visualize the “star_wars_corr_matrix”.

Note

Also experiment with visualizing the correlation matrix using the corrplot.mixed() function. What do you notice?
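
As a warm-up for this challenge, here is a minimal sketch of the same idea on a different dataset. It assumes the corrplot package is installed and uses modeldata::penguins as a stand-in (an assumption, not part of the assignment); the object names are hypothetical. Note the use argument: without it, cor() returns NA for any pair of columns that contains missing values.

```r
library(dplyr)
library(corrplot)

# Assumption: modeldata::penguins is only a stand-in for illustration
penguins_num <- modeldata::penguins %>%
  select(where(is.numeric))          # keep the numeric columns only

# "pairwise.complete.obs" drops missing values pair by pair
penguins_corr_matrix <- cor(penguins_num, use = "pairwise.complete.obs")

# Two ways to draw the same matrix
corrplot(penguins_corr_matrix, method = "circle")
corrplot.mixed(penguins_corr_matrix)
```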

Challenge 2:

Write the code to complete the following:

  • Set a random seed to reproduce the data splitting.

  • Define a data split with a 90/10 proportion and name it “star_wars_split”.

  • Create a “star_wars_train” and a “star_wars_test” object.
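
The three steps above follow a common rsample pattern, sketched here on a stand-in dataset with a different proportion. The dataset (modeldata::penguins), the seed value and the object names are all assumptions chosen for illustration.

```r
library(tidymodels)

# Any fixed integer works; it only has to be set right before the split
set.seed(123)

# prop is the share of rows that goes to the training set (80/20 here)
penguins_split <- initial_split(modeldata::penguins, prop = 0.80)

penguins_train <- training(penguins_split)
penguins_test  <- testing(penguins_split)

penguins_split   # prints the training/testing/total row counts
```

Re-running the block with the same seed should give the same rows in each set, which is the point of setting the seed first.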

Challenge 3:

Create a recipe named “recipe_c3” by following the steps below:

  • Specify mass as the outcome variable and height, birth_year, species, gender and homeworld as predictors in the formula.

  • Make sure to build your model only on the training set.

  • Impute missing values in all the numeric variables with the median.

  • Impute missing values in all the nominal variables with the mode.

  • Standardize all the numeric columns to a mean of 0 and a standard deviation of 1.

  • Create dummy variables for the nominal columns.
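
A recipe with the same kinds of steps might look like the sketch below. It runs on a stand-in dataset (modeldata::penguins, an assumption) with hypothetical object names, and it assumes a recent recipes version in which the steps are called step_impute_median() and step_impute_mode(). The selectors used here, all_numeric_predictors() and all_nominal_predictors(), are one possible choice that leaves the outcome column untouched.

```r
library(tidymodels)

set.seed(123)
penguins_split <- initial_split(modeldata::penguins, prop = 0.80)
penguins_train <- training(penguins_split)

penguins_recipe <-
  recipe(body_mass_g ~ bill_length_mm + bill_depth_mm + species + sex,
         data = penguins_train) %>%                  # training rows only
  step_impute_median(all_numeric_predictors()) %>%   # NA -> column median
  step_impute_mode(all_nominal_predictors()) %>%     # NA -> most frequent level
  step_normalize(all_numeric_predictors()) %>%       # mean 0, sd 1
  step_dummy(all_nominal_predictors())               # factors -> 0/1 columns

# prep() estimates the medians, modes, means and sds on the training data;
# bake(new_data = NULL) returns the processed training set for inspection
penguins_baked <- penguins_recipe %>%
  prep() %>%
  bake(new_data = NULL)

glimpse(penguins_baked)
```

The order matters: normalizing before step_dummy() keeps the 0/1 dummy columns from being rescaled.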

Challenge 4:

Build a regression model:

  • Use parsnip to create a regression model and name it “linear_reg_model”.

  • Set the model engine to “lm”.

  • Specify the mode as regression.

  • Display the model object.
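
For orientation, a parsnip specification generally has three parts: a model type, an engine and a mode. The sketch below shows that shape with a hypothetical object name. Keep in mind that a specification holds no data and is not fitted until it is passed to fit().

```r
library(tidymodels)

# Hypothetical name; this is only a description of the model, not a fit
lm_spec <- linear_reg() %>%    # model type
  set_engine("lm") %>%         # the function that does the computation
  set_mode("regression")       # numeric outcome

# Printing the object shows the model type and the computational engine
lm_spec
```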

Challenge 5:

Build a decision tree model:

  • Use parsnip to create a decision tree model and name it “decision_tree_model”.

  • Set the model engine to “rpart”.

  • Specify the mode as regression.

  • Display the model object.
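
Unlike linear_reg(), decision_tree() can be used for both classification and regression, so the mode has to be stated explicitly. The sketch below uses a hypothetical object name and adds two optional arguments (tree_depth and min_n, with arbitrary values) that the challenge does not ask for, only to show where such settings would go.

```r
library(tidymodels)

# Hypothetical name and arbitrary hyperparameter values, for illustration
tree_spec <- decision_tree(tree_depth = 5, min_n = 10) %>%
  set_engine("rpart") %>%
  set_mode("regression")

tree_spec

# translate() shows how parsnip would call the underlying rpart function
translate(tree_spec)
```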

Challenge 6:

Create a workflow to complete the following:

  • Use the recipe created in Challenge 3.

  • Add the regression model from Challenge 4.

  • Fit the workflow to the starwars train dataset.

  • Display the results of the fitted workflow in a tidy format.
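
The recipe, model and fit steps chain together as in the sketch below, again on a stand-in dataset (modeldata::penguins, an assumption) with hypothetical names and a deliberately small recipe. Rows with a missing outcome are dropped up front in this sketch to keep the example simple; that is a choice made here, not a requirement of the challenge.

```r
library(tidymodels)

# Stand-in data; rows with a missing outcome are removed for simplicity
penguins_complete <- modeldata::penguins %>%
  filter(!is.na(body_mass_g))

set.seed(123)
penguins_split <- initial_split(penguins_complete, prop = 0.80)
penguins_train <- training(penguins_split)

penguins_recipe <-
  recipe(body_mass_g ~ bill_length_mm + species, data = penguins_train) %>%
  step_impute_median(all_numeric_predictors()) %>%
  step_dummy(all_nominal_predictors())

lm_spec <- linear_reg() %>%
  set_engine("lm") %>%
  set_mode("regression")

# The workflow bundles preprocessing and model, then fits both in one call
lm_wf_fit <- workflow() %>%
  add_recipe(penguins_recipe) %>%
  add_model(lm_spec) %>%
  fit(data = penguins_train)

# One row per coefficient: term, estimate, std.error, statistic, p.value
tidy(lm_wf_fit)
```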

Challenge 7:

Create a workflow to complete the following:

  • Use the recipe created in Challenge 3.

  • Add the decision tree model from Challenge 5.

  • Fit the workflow to the starwars train dataset.

  • Display the results of the fitted workflow.
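
A tree has no coefficient table, so printing the fitted workflow is the usual way to look at it. The sketch below uses a stand-in dataset (modeldata::penguins, an assumption) and hypothetical names, and also calls predict() on a few held-out rows, which goes beyond what the challenge asks for.

```r
library(tidymodels)

# Stand-in data; rows with a missing outcome are removed for simplicity
penguins_complete <- modeldata::penguins %>%
  filter(!is.na(body_mass_g))

set.seed(123)
penguins_split <- initial_split(penguins_complete, prop = 0.80)
penguins_train <- training(penguins_split)
penguins_test  <- testing(penguins_split)

penguins_recipe <-
  recipe(body_mass_g ~ bill_length_mm + species, data = penguins_train) %>%
  step_impute_median(all_numeric_predictors()) %>%
  step_dummy(all_nominal_predictors())

tree_spec <- decision_tree() %>%
  set_engine("rpart") %>%
  set_mode("regression")

tree_wf_fit <- workflow() %>%
  add_recipe(penguins_recipe) %>%
  add_model(tree_spec) %>%
  fit(data = penguins_train)

# Prints the preprocessing steps followed by the rpart split rules
tree_wf_fit

# The fitted workflow applies the recipe to new data before predicting
tree_preds <- predict(tree_wf_fit, new_data = head(penguins_test))
tree_preds
```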

🛑 Don’t Click Submit Just Yet 🚧

Please read the information below carefully:

  • Once you have completed all the coding challenges and you’re confident in your work, copy and paste your responses from the chunk into the form fields below each challenge.

  • You are responsible for correctly copying and pasting only the code required to solve each challenge. We will grade only what you have submitted!

  • We will grade only one submission per student, so do not click Submit until you are confident in your responses.

  • By submitting this form, you are certifying that you have followed the academic integrity guidelines available in the syllabus. The code and answers submitted are the result of your work and your work only!

  • Make sure you have completed all the challenges and included all the required personal information (e.g., full name, email, zid) in the respective form fields. If you don’t know how to complete a challenge, or don’t want to, just leave the field below it empty.

  • Now you are ready to click the “Submit” button above. Congrats, you have completed this set of challenges!