Category: Data Science Tutorials

Introduction to Python Ensembles

Stacking models in Python efficiently

Ensembles have rapidly become one of the most popular methods in applied machine learning. Virtually every winning Kaggle solution features them, and many data science pipelines include them. Put simply, ensembles combine predictions from different models to generate a final prediction, and the more models we […]

How to Set Up a Free Data Science Environment on Google Cloud

Whether you’re running out of memory on your local machine or simply want your code to run faster on a more powerful machine, there are many benefits to doing data science on a cloud server. A cloud server is really just a computer, like the one you’re using now, that’s located elsewhere. In this post, […]

Tutorial: Learning Curves for Machine Learning in Python

This Python data science tutorial uses a real-world data set to teach you how to diagnose and reduce bias and variance in machine learning.

Adding Axis Labels to Plots With pandas

Pandas plotting methods provide an easy way to plot pandas objects. Often though, you’d like to add axis labels, which involves understanding the intricacies of Matplotlib syntax. Thankfully, there’s a way to do this entirely using pandas. Let’s start by importing the required libraries: import pandas as pd import numpy as np import matplotlib.pyplot as […]

Tutorial: Concatenation (Combining Data Tables) with Pandas and Python

In this tutorial, we walk through several methods of combining data tables (concatenation) using pandas and Python, working with labor market data.

Tutorial: Using Excel with Python and Pandas

In this tutorial, we’ll learn to work with Excel files in Python using pandas — everything from setting up your computer to moving and visualizing data.

Setting Up the PyData Stack on Windows

The speed of modern electronic devices allows us to crunch large amounts of data at home. However, these devices require the right software in order to reach peak performance. Luckily, it’s now easier than ever to set up your own data science environment. One of the most popular stacks for data science is PyData, a […]

Kaggle Fundamentals: The Titanic Competition

Kaggle is a site where people create algorithms and compete against machine learning practitioners around the world. Your algorithm wins the competition if it’s the most accurate on a particular data set. Kaggle is a fun way to practice your machine learning skills. This tutorial is based on part of our free, four-part course: Kaggle […]

Tutorial: Loading Data into Postgres using Python and CSVs

This in-depth tutorial covers how to use Python and SQL to load data from CSV files into Postgres using the psycopg2 library.

Explore Happiness Data Using Python Pivot Tables

One of the biggest challenges when facing a new data set is knowing where to start and what to focus on. Being able to quickly summarize hundreds of rows and columns can save you a lot of time and frustration. A simple tool you can use to achieve this is a pivot table, which helps […]
