In the Multivariate K-Nearest Neighbors lesson, we focused on increasing the number of attributes the model uses. We saw that adding more attributes generally lowered the model's error, because the model could better identify the living spaces in the training set that are most similar to those in the test set.
However, we also observed that using all of the available features didn't automatically improve the model's accuracy, and that some features were probably not relevant for similarity ranking. We learned that selecting relevant features, rather than simply increasing the number of features used, was the right lever for improving a model's accuracy.
In this lesson, we'll focus on the impact of increasing `k`, the number of nearby neighbors the model uses to make predictions. Values that affect the behavior and performance of a model but are unrelated to the data it uses are referred to as hyperparameters. The process of finding the optimal hyperparameter value is known as hyperparameter optimization.
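To make the role of `k` concrete, here is a minimal sketch of a univariate nearest-neighbors prediction in plain Python. The listings, the single feature (number of guests accommodated), and the helper name `knn_predict` are all hypothetical and chosen only for illustration; the training data stays fixed while only `k` changes.

```python
def knn_predict(train, query, k):
    """Average the target of the k training rows closest to `query`.

    `train` is a list of (feature, target) pairs.
    """
    nearest = sorted(train, key=lambda row: abs(row[0] - query))[:k]
    return sum(target for _, target in nearest) / k


# Hypothetical listings: (guests accommodated, nightly price)
train = [(1, 50.0), (2, 80.0), (4, 150.0), (5, 170.0), (8, 400.0)]

# Same training data, same query, three different values of k
pred_k1 = knn_predict(train, query=2, k=1)
pred_k3 = knn_predict(train, query=2, k=3)
pred_k5 = knn_predict(train, query=2, k=5)

print(pred_k1, pred_k3, pred_k5)
```

The three predictions differ even though the data is identical, which is what makes `k` a hyperparameter rather than something learned from the data.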
A simple and common hyperparameter optimization technique is known as grid search. Grid search essentially boils down to evaluating the model's performance at different `k` values and selecting the `k` value that results in the lowest error. In this lesson, we'll take a look at how to apply this in our code.
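The grid search described above can be sketched in a few lines of plain Python. This is an illustrative outline under stated assumptions, not the code used later in the lesson: the toy listings (size in square meters, nightly price), the helper names `knn_predict` and `rmse`, and the choice of RMSE as the error metric are all assumptions made for the example.

```python
import math


def knn_predict(train, query, k):
    """Average the target of the k training rows closest to `query`."""
    nearest = sorted(train, key=lambda row: abs(row[0] - query))[:k]
    return sum(target for _, target in nearest) / k


def rmse(train, test, k):
    """Root mean squared error of k-nearest-neighbors predictions on `test`."""
    squared_errors = [(knn_predict(train, x, k) - y) ** 2 for x, y in test]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical listings: (size in square meters, nightly price)
train = [(20, 50.0), (35, 80.0), (50, 110.0),
         (65, 150.0), (90, 240.0), (120, 400.0)]
test = [(40, 95.0), (100, 280.0)]

# The "grid": every candidate value of k we are willing to try
k_values = [1, 2, 3, 4, 5]

# Evaluate the model once per candidate and record its error
errors = {k: rmse(train, test, k) for k in k_values}

# Keep the candidate with the lowest error
best_k = min(errors, key=errors.get)

for k in k_values:
    print(k, round(errors[k], 2))
print("best k:", best_k)
```

Storing the errors in a dictionary keyed by `k` keeps every result available for later inspection or plotting, rather than only the winning value.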
As you work through each concept, you'll apply what you've learned directly in your browser, so there's no need to use your own machine for the exercises. The Python environment in this course includes answer checking, so you can confirm you've mastered each concept before moving on to the next.
2. Hyperparameter optimization
3. Expanding grid search
4. Visualizing hyperparameter values
5. Varying features and hyperparameters
6. Practice the workflow
7. Next steps