Tag Archives for " Scikit-Learn "

An Intro to Deep Learning in Python

Deep learning is a type of machine learning that’s growing at an almost frightening pace. Nearly every projection has the deep learning industry expanding massively over the next decade. This market research report, for example, expects deep learning to grow 71x in the US and more than that globally over the next ten years. There’s […]

Learning Curves for Machine Learning

Diagnose Bias and Variance to Reduce Error When building machine learning models, we want to keep error as low as possible. Two major sources of error are bias and variance. If we managed to reduce these two, then we could build more accurate models. But how do we diagnose bias and variance in the first […]

Machine Learning Fundamentals: Predicting Airbnb Prices

Machine learning is easily one of the biggest buzzwords in tech right now. Over the past three years, Google searches for “machine learning” have increased by over 350%. But understanding machine learning can be difficult — you either use pre-built packages that act like ‘black boxes’ where you pass in data and magic comes out […]

Preparing and Cleaning Data for Machine Learning

Cleaning and preparing data is a critical first step in any machine learning project. In this blog post, Dataquest student Daniel Osei takes us through examining a dataset, selecting columns for features, exploring the data visually and then encoding the features for machine learning. After first reading about Machine Learning on Quora in 2015, Daniel […]

Python for data science: Getting started

Python is becoming an increasingly popular language for data science, and with good reason. It’s easy to learn, has powerful data science libraries, and integrates well with databases and tools like Hadoop and Spark. With Python, we can perform the full lifecycle of data science projects, including reading data in, analyzing data, visualizing data, and […]


Tutorial: K Nearest Neighbors in Python

In this post, we’ll be using the K-nearest neighbors algorithm to predict how many points NBA players scored in the 2013-2014 season. Along the way, we’ll learn about euclidean distance and figure out which NBA players are the most similar to Lebron James. If you want to follow along, you can grab the dataset in […]


Tutorial: K-Means Clustering US Senators

Clustering is a powerful way to split up datasets into groups based on similarity. A very popular clustering algorithm is K-means clustering. In K-means clustering, we divide data up into a fixed number of clusters while trying to ensure that the items in each cluster are as similar as possible. In this post, we’ll explore […]

Share On Facebook
Share On Twitter
Share On Linkedin
Share On Reddit