Tag: Learn Python

Data Retrieval and Cleaning: Tracking Migratory Patterns

Advancing your skills is an important part of being a data scientist. When starting out, you mostly focus on learning a programming language, proper use of third-party tools, displaying visualizations, and the theoretical understanding of statistical algorithms. The next step is to test your skills on more difficult data sets. Sometimes these data sets […]

Generating Climate Temperature Spirals in Python

In this tutorial, we’ll recreate Ed Hawkins’ climate spirals in Python using pandas and matplotlib.

Regex Cheat Sheet: A Quick Guide to Regular Expressions in Python

Keep this PDF Python cheat sheet nearby anytime you need to use regular expressions for your data science work, as a quick, handy reference.

Introduction to AWS for Data Scientists

These days, many businesses use cloud-based services; as a result, various companies have started building and providing such services. Amazon began the trend with Amazon Web Services (AWS). While AWS began in 2006 as a side business, it now makes $14.5 billion in revenue each year. Other leaders in this area include Google, with Google Cloud […]

Tutorial: Python Functions and Functional Programming

Learn about functions in Python and master the basics of functional Python programming in this in-depth tutorial for data scientists and programmers.

Introduction to Python Ensembles

Stacking models in Python efficiently. Ensembles have rapidly become one of the hottest and most popular methods in applied machine learning. Virtually every winning Kaggle solution features them, and many data science pipelines have ensembles in them. Put simply, ensembles combine predictions from different models to generate a final prediction, and the more models we […]
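As a rough illustration of combining predictions from different models, here is a minimal sketch that averages the outputs of several models. It shows plain averaging, which is simpler than the stacking the post covers, and the three model functions and the `average_ensemble` helper are hypothetical stand-ins rather than anything from the post.

```python
from statistics import mean


def average_ensemble(models, x):
    """Combine several models by averaging their predictions for one input."""
    return mean(model(x) for model in models)


# Three hypothetical "models": simple functions standing in for trained estimators.
def model_a(x):
    return 2.0 * x


def model_b(x):
    return 2.0 * x + 1.0


def model_c(x):
    return 2.0 * x - 1.0


# The final prediction is the mean of the individual predictions.
prediction = average_ensemble([model_a, model_b, model_c], 3.0)
print(prediction)
```

Averaging is only one way to combine models; a stacked ensemble would instead train a further model on the individual predictions.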

Postgres Internals: Building a Description Tool

In previous blog posts, we have described the Postgres database and ways to interact with it using Python. Those posts provided the basics, but if you want to work with databases in production systems, then it is necessary to know how to make your queries faster and more efficient. To understand what efficiency means in […]

Tutorial: Learning Curves for Machine Learning in Python

This Python data science tutorial uses a real-world data set to teach you how to diagnose and reduce bias and variance in machine learning.

Adding Axis Labels to Plots With pandas

Pandas plotting methods provide an easy way to plot pandas objects. Often, though, you’d like to add axis labels, which involves understanding the intricacies of Matplotlib syntax. Thankfully, there’s a way to do this entirely using pandas. Let’s start by importing the required libraries: import pandas as pd import numpy as np import matplotlib.pyplot as […]

Tutorial: Concatenation (Combining Data Tables) with Pandas and Python

In this tutorial, we walk through several methods of combining data tables (concatenation) using pandas and Python, working with labor market data.

Tutorial: Using Excel with Python and Pandas

In this tutorial, we’ll learn to work with Excel files in Python using pandas — everything from setting up your computer to moving and visualizing data.

Setting Up the PyData Stack on Windows

The speed of modern electronic devices allows us to crunch large amounts of data at home. However, these devices require the right software in order to reach peak performance. Luckily, it’s now easier than ever to set up your own data science environment. One of the most popular stacks for data science is PyData, a […]

Kaggle Fundamentals: The Titanic Competition

Kaggle is a site where people create algorithms and compete against machine learning practitioners around the world. Your algorithm wins the competition if it’s the most accurate on a particular data set. Kaggle is a fun way to practice your machine learning skills. This tutorial is based on part of our free, four-part course: Kaggle […]

Tutorial: Loading Data into Postgres using Python and CSVs

This in-depth tutorial covers how to use Python and SQL to load data from CSV files into Postgres using the psycopg2 library.

Explore Happiness Data Using Python Pivot Tables

One of the biggest challenges when facing a new data set is knowing where to start and what to focus on. Being able to quickly summarize hundreds of rows and columns can save you a lot of time and frustration. A simple tool you can use to achieve this is a pivot table, which helps […]
