Clustering Basics

In this module, you will use the k-means clustering algorithm to get familiar with the basics of clustering. k-means uses Euclidean distance to group similar data points into clusters. You will use the KMeans class from scikit-learn to cluster U.S. senators based on how they voted.
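To make the role of Euclidean distance concrete, here is a minimal sketch that compares made-up voting records. The vote encoding (1 for yes, 0 for no, 0.5 for abstain), the three sample senators, and the helper name euclidean_distance are assumptions for illustration, not the course's dataset or code.

```python
import numpy as np

# Hypothetical voting records on five bills (assumed encoding:
# 1.0 = yes, 0.0 = no, 0.5 = abstain). Not real data.
senator_a = np.array([1.0, 0.0, 1.0, 1.0, 0.0])
senator_b = np.array([1.0, 0.0, 1.0, 0.5, 0.0])
senator_c = np.array([0.0, 1.0, 0.0, 0.0, 1.0])


def euclidean_distance(u, v):
    """Square root of the summed squared differences between two vectors."""
    return float(np.sqrt(np.sum((u - v) ** 2)))


# Senators who vote alike end up close together; opposite voters are far apart.
dist_ab = euclidean_distance(senator_a, senator_b)
dist_ac = euclidean_distance(senator_a, senator_c)

print(f"a vs b: {dist_ab:.3f}")
print(f"a vs c: {dist_ac:.3f}")
```

Senators a and b differ on a single vote, so their distance is small, while a and c disagree on every bill and sit much further apart. That notion of closeness is what k-means relies on when it groups rows.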

In past courses, we've looked at regression and classification. These are both types of supervised machine learning. In supervised learning, you train an algorithm to predict an unknown variable from known variables. Another major type of machine learning is called unsupervised learning. In unsupervised learning, we aren't trying to predict anything. Instead, we're finding patterns in data.

One of the main unsupervised learning techniques is called clustering. We use clustering when we're trying to explore a dataset, and understand the connections between the various rows and columns. Clustering is a key way to explore unknown data, and a very commonly used machine learning technique.
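As a rough sketch of what clustering looks like in practice, the example below fits scikit-learn's KMeans to a small, made-up vote matrix. The data, the choice of two clusters, and the random_state value are assumptions for illustration only; the course works with a real Senate voting dataset.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical vote matrix: one row per senator, one column per bill
# (assumed encoding: 1 = yes, 0 = no, 0.5 = abstain). Not real data.
votes = np.array([
    [1, 1, 0, 1],
    [1, 1, 0, 0.5],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
    [0, 0, 1, 0.5],
    [0, 0, 1, 0],
], dtype=float)

# Ask for two clusters. No labels are supplied: the algorithm only sees votes.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=1)
labels = kmeans.fit_predict(votes)

# transform() returns each row's distance to every cluster center.
distances = kmeans.transform(votes)

print(labels)
print(distances.round(2))
```

The cluster numbers themselves are arbitrary. What matters is which rows share a label, and how far each row sits from its own cluster center compared with the other one.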

As you work through each concept, you'll apply what you've learned directly in your browser, so there's no need to use your own machine for the exercises. The Python environment in this course includes answer checking, so you can confirm that you've mastered each concept before moving on to the next.


  • Learn how clustering helps you find patterns in data
  • Learn how to use k-means clustering

Mission Outline

1. Clustering overview
2. The dataset
3. Exploring the data
4. Distance between Senators
5. Initial clustering
6. Initial clustering
7. Exploring the clusters
8. Exploring Senators in the wrong cluster
9. Plotting out the clusters
10. Finding the most extreme
11. Next steps
12. Takeaways


Course Info:


The median completion time for this course is 5.6 hours.

This course requires a premium subscription and includes 1 free mission and 5 paid missions, including 1 guided project. It is the 21st course in the Data Scientist in Python path.

