Multivariate K-Nearest Neighbors
At the beginning of the course, we learned about the k-nearest neighbors algorithm and chose a single feature to predict rent price. When we introduced the caret library, we actually used two features to predict tidy_price. We didn't go into the specifics of using two features then, but in this mission we'll explore how adding new features to the algorithm changes the predictions it makes for new listings.
Setting a proper rental price for a listing is a complex task that requires much more information than how many people the listing can accommodate. Rental listings have many qualities, some shared between similar listings, so incorporating more of this information should help us make better predictions.
By adding more features, we aim to keep distances between similar listings small while increasing distances to listings that are not similar to ours. Together, these two changes help improve prediction quality.
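To make this concrete, the Euclidean distance generalizes naturally to any number of features: sum the squared differences across every feature, then take the square root. Here is a minimal sketch in base R, using a hypothetical pair of listings with made-up feature values (accommodates, bedrooms, bathrooms are assumptions for illustration, not columns from the course dataset):

```r
# Generalized Euclidean distance across n features:
#   dist(p, q) = sqrt( (p1 - q1)^2 + (p2 - q2)^2 + ... + (pn - qn)^2 )

# Hypothetical feature vectors for two listings
our_listing   <- c(accommodates = 3, bedrooms = 1, bathrooms = 1)
other_listing <- c(accommodates = 5, bedrooms = 2, bathrooms = 2)

euclidean_distance <- function(p, q) {
  sqrt(sum((p - q)^2))
}

euclidean_distance(our_listing, other_listing)
# sqrt(4 + 1 + 1) = sqrt(6), approximately 2.449
```

With one feature this reduces to the absolute difference we used earlier in the course; each added feature simply contributes one more squared-difference term.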
In this mission, we'll cover:

- Generalizing the distance formula
- More data cleaning
- Handling missing values
- Fitting multiple models
- Comparing model performance
- Long format data
- group_by() and summarize()
- The power of piping
- Next steps
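One of the topics above, handling missing values, matters as soon as we use multiple features, because a distance can't be computed when any feature is missing. The base-R sketch below shows two common first steps, counting missing values per column and dropping incomplete rows; the tiny `listings` data frame is hypothetical, standing in for the mission's real dataset:

```r
# Hypothetical listings data with some missing feature values
listings <- data.frame(
  accommodates = c(2, 4, NA, 6),
  bedrooms     = c(1, NA, 2, 3)
)

# Count missing values in each column
colSums(is.na(listings))

# Drop any row that has at least one missing value
complete <- na.omit(listings)
nrow(complete)  # only rows 1 and 4 survive, so 2 rows remain
```

Dropping rows is the simplest option; depending on how much data is lost, imputing values (for example, with a column mean) can be a better choice, and we'll weigh such trade-offs during the cleaning steps.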