At Dataquest, we strongly advocate portfolio projects as a means of getting a first data science job. In this blog post, we’ll walk you through an example portfolio project. The project is part of our Statistics Intermediate: Averages and Variability course, and it assumes familiarity with: sampling (populations, samples, sample representativity), frequency distributions, box plots […]

At Dataquest, we strongly advocate portfolio projects as a means of getting your first data science job. In this blog post, we’ll walk you through an example portfolio project. The project is part of our Statistics Fundamentals course, and it assumes some familiarity with: sampling (simple random sampling, populations, samples, parameters, statistics), variables, frequency distributions […]

Pandas plotting methods provide an easy way to plot pandas objects. Often though, you’d like to add axis labels, which involves understanding the intricacies of Matplotlib syntax. Thankfully, there’s a way to do this entirely using pandas. Let’s start by importing the required libraries: import pandas as pd import numpy as np import matplotlib.pyplot as […]
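As a rough illustration of the idea in the excerpt above, here is a minimal sketch of labelling the axes of a pandas plot. It is not necessarily the approach the full post takes (the excerpt is truncated); it relies on the fact that pandas plotting methods return a Matplotlib Axes object, and the column names and sample data are made up for the example.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display

import numpy as np
import pandas as pd

# Hypothetical sample data; the column names are assumptions for illustration only.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "height_cm": rng.normal(170, 10, 50),
    "weight_kg": rng.normal(70, 8, 50),
})

# DataFrame.plot returns a Matplotlib Axes, which we can label directly.
ax = df.plot(kind="scatter", x="height_cm", y="weight_kg", title="Height vs. weight")
ax.set_xlabel("Height (cm)")
ax.set_ylabel("Weight (kg)")
```

The returned `ax` can then be saved or shown in the usual way, for example with `ax.figure.savefig(...)`.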

Data science blogs provide an ideal forum to show off your work in job applications and to the public. Learn to build one with Pelican, Jupyter Notebook, and GitHub Pages.

This is the first in a series of posts on how to build a Data Science Portfolio. You can find links to the other posts in this series at the bottom of the post. Data science companies are increasingly looking at portfolios when making hiring decisions. One of the reasons for this is that a […]

Analyzing Tweets with Pandas and Matplotlib. Python has a variety of visualization libraries, including seaborn, networkx, and vispy. Most Python visualization libraries are based wholly or partially on matplotlib, which often makes it the first resort for making simple plots, and the last resort for making plots too complex to create in other libraries. In […]

In this Python programming and data science tutorial, learn to work with large JSON files in Python using the pandas library.

Learn how seven Python data visualization tools can be used together to perform exploratory data analysis and aid in data viz tasks.

Benjamin Root is a contributor to the Matplotlib data visualization library and focuses on improving documentation as well as the mplot3d toolkit within Matplotlib.

In this tutorial, we’ll guide you through the basic principles of machine learning and show you how to get started with machine learning in Python.

Learn to use K-means clustering in Python with this free tutorial that walks you through how to plot members of the US Senate.