In the last mission, we learned how to incorporate k-fold cross-validation into our model creation in caret. We saw that caret actually does some extra work under the hood to figure out an optimal number of neighbors to use with the training data. All we need to do is supply the model with data, but it's important to understand what caret is doing in choosing this optimal number.
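
To make this concrete, here is a minimal sketch of that automatic tuning, using the built-in iris dataset as a stand-in for our own training data (the dataset and formula here are placeholders, not the course data):

```r
library(caret)

set.seed(1)

# trainControl() sets up 5-fold cross-validation; by default, train()
# then tries several values of k and keeps the best-performing one.
knn_model <- train(
  Species ~ .,
  data = iris,
  method = "knn",
  trControl = trainControl(method = "cv", number = 5)
)

# Printing the model shows the cross-validated accuracy for each k
# that caret tested, along with the value it ultimately selected.
knn_model
```
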

In this mission, we'll take a closer look at this process and learn how to manage it ourselves in caret. Up until now, we've mainly focused on cleaning the data and preparing it for the model; here, we'll focus on how the number of neighbors (the k in k-nearest neighbors) affects model performance.

The number of neighbors, k, is a setting of the algorithm itself rather than a value learned from the training data. Such settings are sometimes called tuning parameters, but in this course we will refer to them as hyperparameters, and we'll be diving into them in more depth in this mission.
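
As a preview of what we'll do in this mission, here is a sketch of supplying our own candidate values of k rather than relying on caret's defaults, assuming the same iris setup as above. The tuneGrid argument takes a data frame with one column per hyperparameter; for k-nearest neighbors, that is a single column named k.

```r
library(caret)

set.seed(1)

# Supply our own grid of odd values of k from 1 to 15; caret will
# cross-validate each one and select the best performer.
knn_grid_model <- train(
  Species ~ .,
  data = iris,
  method = "knn",
  trControl = trainControl(method = "cv", number = 5),
  tuneGrid = data.frame(k = seq(1, 15, by = 2))
)

knn_grid_model
```
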

Objectives

  • Learn to manage hyperparameters with caret
  • Build a model that performs well on new data

Lesson Outline

  1. Hyperparameters
  2. Hyperparameter optimization
  3. Using our hyperparameter grid
  4. Visualizing performance by hyperparameter
  5. Experimenting with hyperparameters
  6. Hyperparameters and cross-validation
  7. Next steps
  8. Takeaways