MISSION 139

Introduction To K-Nearest Neighbors

In this lesson of the machine learning fundamentals course, we'll learn about k-nearest neighbors, a machine learning technique that makes predictions from the most similar known examples. Machine learning is the process of discovering patterns in existing data to make predictions.

To learn how k-nearest neighbors works, we will explore AirBnB data to find optimal rental pricing — low enough to be competitive, but not so low that we miss out on potential revenue.

In the real world, you would likely use `scikit-learn` to implement the k-nearest neighbors algorithm. But understanding how this algorithm works “under the hood” will help you understand when to apply it, so in this mission we will implement the algorithm by hand. 

As we work through the AirBnB pricing question, we will walk through the entire machine learning workflow — from selecting a feature to testing the model — so you can see what it’s like to explore a problem like a data scientist would.

As with all our courses, you will be asked to apply what you’re learning in our in-browser app, which will also check your answers so you can ensure you've fully mastered each concept.

Objectives

  • The basics of the machine learning workflow.
  • How the k-nearest neighbors algorithm works.
  • The role of Euclidean distance in machine learning.

Mission Outline

1. Problem definition
2. Introduction to the data
3. K-nearest neighbors
4. Euclidean distance
5. Calculate distance for all observations
6. Randomizing and sorting
7. Average price
8. Function to make predictions
9. Next steps
10. Takeaways
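
To give a feel for where the outline is headed, here is a minimal sketch of steps 4 through 8 in plain Python. It is not the mission's own code: the listings are made-up values rather than the AirBnB data, the single feature (number of guests a listing accommodates) is an assumption chosen for illustration, and the names `euclidean_distance` and `predict_price` are hypothetical.

```python
import math
import random

# Hypothetical listings: (guests accommodated, nightly price in USD).
# These values are invented for illustration; they are not the AirBnB data.
listings = [
    (1, 60.0),
    (2, 80.0),
    (2, 100.0),
    (3, 120.0),
    (4, 150.0),
    (6, 250.0),
]


def euclidean_distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def predict_price(accommodates, data, k=3, seed=1):
    """Average the prices of the k listings most similar to the new one."""
    rows = list(data)
    # Shuffle first so that ties in distance are not decided by the
    # original ordering of the data (the sort below is stable).
    random.Random(seed).shuffle(rows)
    # Sort by distance between each listing's feature and the new listing's.
    rows.sort(key=lambda row: euclidean_distance([row[0]], [accommodates]))
    nearest = rows[:k]
    return sum(price for _, price in nearest) / len(nearest)


# Predict a nightly price for a new listing that accommodates 2 guests,
# using its 2 nearest neighbors.
predicted = predict_price(2, listings, k=2)
print(predicted)
```

With only one feature, the Euclidean distance reduces to the absolute difference between two values; the general form is kept here because it extends to several features unchanged.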

Course Info:

Beginner

The median completion time for this course is 7 hours.

This course requires a premium subscription. It includes five missions and one guided project, and it is the 17th course in the Data Scientist in Python path.
