Learn about machine learning in Python and build your very first ML model from scratch to predict Airbnb prices using k-nearest neighbors.
Learn data cleaning for a machine learning project by cleaning and preparing loan data from LendingClub for a predictive analytics project.
A collection of the best places to find free data sets for data visualization, data cleaning, machine learning, and data processing projects.
Are you ready to learn Python for Data Science? With the right program, habits and structure you can master it more quickly than you might think.
Take the first step into image analysis in Python by using k-means clustering to analyze the dominant colors in an image in this free data science tutorial.
Deep learning is a type of machine learning that’s growing at an almost frightening pace. Nearly every projection has the deep learning industry expanding massively over the next decade. This market research report, for example, expects deep learning to grow 71x in the US and more than that globally over the next ten years. There’s […]
Learn to use Python dictionaries to store, sort, and access data in this in-depth tutorial analyzing craft beer data to master dictionary techniques.
Error metrics are short and useful summaries of the quality of our data. We dive into four common regression metrics and discuss their use cases.
Getting into Machine Learning and AI is not an easy task, but is a critical part of data science programs. Many aspiring professionals and enthusiasts find it hard to establish a proper path into the field, given the enormous amount of resources available today. The field is evolving constantly and it is crucial that we […]
Learn how to do descriptive statistics in Python with this in-depth tutorial that covers the basics (mean, median, and mode) and more advanced topics.
Python generators are a powerful, but misunderstood tool. They’re often treated as too difficult a concept for beginning programmers to learn — creating the illusion that beginners should hold off on learning generators until they are ready. I think this assessment is unfair, and that you can use generators sooner than you think. In this […]
The data science life cycle is generally comprised of the following components: data retrieval data cleaning data exploration and visualization statistical or predictive modeling While these components are helpful for understanding the different phases, they don’t help us think about our programming workflow. Often, the entire data science life cycle ends up as an arbitrary […]
Advancing your skills is an important part of being a data scientist. When starting out, you mostly focus on learning a programming language, proper use of third party tools, displaying visualizations, and the theoretical understanding of statistical algorithms. The next step is to test your skills on more difficult data sets. Sometimes these data sets […]
Ed Hawkins, a climate scientist, tweeted the following animated visualization in 2017 and captivated the world: This visualization shows the deviations from the average temperature between 1850 and 1900. It was reshared millions of times over Twitter and Facebook and a version of it was even shown at the opening ceremony for the Rio Olympics. […]
The Jupyter Notebook is an incredibly powerful tool for interactively developing and presenting data science projects. A notebook integrates code and its output into a single document that combines visualisations, narrative text, mathematical equations, and other rich media. The intuitive workflow promotes iterative and rapid development, making notebooks an increasingly popular choice at the heart […]
Keep this PDF Python cheat sheet nearby anytime you need to use regular expressions for your data science work, as a quick, handy reference.
These days, many businesses use cloud based services; as a result various companies have started building and providing such services. Amazon began the trend, with Amazon Web Services (AWS). While AWS began in 2006 as a side business, it now makes $14.5 billion in revenue each year. Other leaders in this area include: Google—Google Cloud […]