Josh Devlin
Author Archives: Josh Devlin

Want a Job in Data? Learn This.

Why mastering a 50-year-old programming language is the key to getting a data science job. SQL is old. There, I said it. I first heard about SQL in 1997. I was in high school, and as part of a computing class we were working with databases in Microsoft Access. The computers we used were outdated, […]

Adding Axis Labels to Plots With pandas

Pandas plotting methods provide an easy way to plot pandas objects. Often though, you’d like to add axis labels, which involves understanding the intricacies of Matplotlib syntax. Thankfully, there’s a way to do this entirely using pandas. Let’s start by importing the required libraries: import pandas as pd import numpy as np import matplotlib.pyplot as […]
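
As a minimal sketch of the idea in this excerpt: pandas plotting methods return a Matplotlib Axes object, and axis labels can be set on that object directly. The excerpt is truncated, so this may not be the exact approach the post takes. The DataFrame, column names, and label text below are hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display

import numpy as np
import pandas as pd

# Hypothetical data: ten points with random y values
rng = np.random.default_rng(0)
df = pd.DataFrame({"x": np.arange(10), "y": rng.random(10)})

# The pandas plot call returns a Matplotlib Axes object
ax = df.plot(x="x", y="y", kind="scatter")

# Set the axis labels on the returned Axes
ax.set_xlabel("x value")
ax.set_ylabel("y value")
```

Keeping a reference to the returned `ax` avoids separate `plt.xlabel` and `plt.ylabel` calls, and it keeps the labels tied to the right plot when a figure holds more than one.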

Kaggle Fundamentals: The Titanic Competition

Kaggle is a site where people create algorithms and compete against machine learning practitioners around the world. Your algorithm wins the competition if it’s the most accurate on a particular data set. Kaggle is a fun way to practice your machine learning skills. This tutorial is based on part of our free, four-part course: Kaggle […]

Machine Learning Fundamentals: Predicting Airbnb Prices

Machine learning is easily one of the biggest buzzwords in tech right now. Over the past three years, Google searches for “machine learning” have increased by over 350%. But understanding machine learning can be difficult — you either use pre-built packages that act like ‘black boxes’ where you pass in data and magic comes out […]

Python Cheat Sheet for Data Science: Intermediate

The tough thing about learning data is remembering all the syntax. While at Dataquest we advocate getting used to consulting the Python documentation, sometimes it’s nice to have a handy reference, so we’ve put together this cheat sheet to help you out! This cheat sheet is the companion […]

How to Get Your First Job as a Data Scientist.

Many aspiring data scientists focus on doing Kaggle competitions as a way to build their portfolios. Kaggle is an excellent way to practice, but it should only be one of many avenues you use to work on data science projects. This is because Kaggle competitions only focus on a narrow part of data science work. […]

Introducing our new Interface

Our new mission design has arrived! Over the past few months we’ve been tirelessly talking to students like you to learn how we can improve the mission interface. Today we are unveiling the results of this hard work. Since a lot has changed, we wanted to take a moment to describe the big changes and […]

Using pandas with Large Data Sets

Tips for reducing memory usage by up to 90%. When working with small data in pandas (under 100 megabytes), performance is rarely a problem. When we move to larger data (100 megabytes to multiple gigabytes), performance issues can make run times much longer, and cause code to fail entirely due to insufficient memory. While tools […]
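
The excerpt does not say which techniques the post covers. As an illustrative sketch, two common ways to shrink a DataFrame are downcasting numeric columns to smaller types and storing repetitive strings as categoricals. Both are assumptions here, not necessarily the post's method, and the data and column names are hypothetical. The savings depend entirely on the data.

```python
import numpy as np
import pandas as pd

# Hypothetical data: an int64 column and a string column with few distinct values
df = pd.DataFrame({
    "count": np.arange(1000, dtype="int64"),
    "team": ["red", "blue"] * 500,
})

# Measure memory including the contents of the string column
before = df.memory_usage(deep=True).sum()

# Downcast integers to the smallest unsigned type that can hold the values
df["count"] = pd.to_numeric(df["count"], downcast="unsigned")

# Store repeated strings once and reference them by integer codes
df["team"] = df["team"].astype("category")

after = df.memory_usage(deep=True).sum()
print(f"before: {before} bytes, after: {after} bytes")
```

Passing `deep=True` matters for string columns: without it, pandas may report only the size of the object pointers and not the strings themselves.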

Python Cheat Sheet for Data Science: Basics

It’s common when first learning Python for Data Science to have trouble remembering all the syntax that you need. While at Dataquest we advocate getting used to consulting the Python documentation, sometimes it’s nice to have a handy reference, so we’ve put together this cheat sheet to help you out! This cheat sheet is the […]

Should I Learn Python 2 or 3?

One of the biggest sources of confusion and misinformation for people wanting to learn Python is which version they should learn. Should I learn Python 2.x or Python 3.x? Indeed, this is one of the questions we are asked most often at Dataquest, where we teach Python as part of our Data Science curriculum. This […]

Harry: “Dataquest helped me start my career in data”

While working as a geophysicist for an oil services company, Harry Robinson found himself interested in data. “My job involved lots of data, but it was always at arm’s length. We were applying algorithms, but I never got to see them. “I wanted to know what was happening and why, so I could interpret the results.” He decided […]

NumPy Cheat Sheet — Python for Data Science

NumPy is the library that gives Python its ability to work with data at speed. Originally launched in 1995 as ‘Numeric,’ NumPy is the foundation on which many important Python data science libraries are built, including Pandas, SciPy and scikit-learn. It’s common when first learning NumPy to have trouble remembering all the functions and methods […]

Kyle: “Dataquest helped me get into the tech industry”

For the first four years of his career, Kyle Stewart worked as a product manager in industrial automation. “I was working for a Fortune 500 company. I managed products that helped industrial processes, like at an oil refinery.” He wanted to move into the more dynamic tech industry. “In industrial product management it’s difficult to make […]

What’s New in v1.14: Data Engineering Path & Performance Improvements!

Our latest Dataquest release has over 20 new features, including many major performance improvements and the launch of our much-anticipated data engineering path. New Path: Data Engineering The first course in our Data Engineering Path is here! Data Engineering is a broad field which includes: Working with Big Data Architecting distributed systems Creating reliable pipelines […]

Pandas Cheat Sheet — Python for Data Science

Pandas is arguably the most important Python package for data science. Not only does it give you lots of methods and functions that make working with data easier, but it has been optimized for speed which gives you a significant advantage compared with working with numeric data using Python’s built-in functions. It’s common when first […]

Preparing and Cleaning Data for Machine Learning

Cleaning and preparing data is a critical first step in any machine learning project. In this blog post, Dataquest student Daniel Osei takes us through examining a dataset, selecting columns for features, exploring the data visually and then encoding the features for machine learning. After first reading about Machine Learning on Quora in 2015, Daniel […]

Dong: “Dataquest helped me get a job I love”

After 4 years of working in postdoc positions, Dong Zhou was starting to re-evaluate academia. “It’s not a real job in terms of compensation and stability. I decided to quit postdoc and try working in industry.” Dong started to explore learning software development and data science. “I started off trying to learn with books, but I […]

What’s New in v1.10: Answer diffs, Improved Q&A!

Along with our two new data visualization courses (Exploratory Data Visualization and Storytelling Through Data Visualization) our latest release includes two major features designed to make your life easier — enhanced Q&A and answer diffing. Introducing: Output & Variable Diffing When you’re learning to code, it can be frustrating to be stuck on an exercise […]

What’s New in Dataquest v1.9: Console, hotkeys, and more!

Whenever you send us feedback or an idea for a feature, we read and catalogue your suggestions. We then use this to help plan features and improvements for Dataquest. Today we’re excited to launch two of our most-requested features: Hotkeys and a Python Console. Introducing the Python console Many of you have told us that […]

28 Jupyter Notebook tips, tricks, and shortcuts

This post is based on a post that originally appeared on Alex Rogozhnikov’s blog, ‘Brilliantly Wrong’. We have expanded the post and will continue to do so over time — if you have a suggestion please let us know. Thanks to Alex for graciously letting us republish his work here. Jupyter Notebook Jupyter notebook, formerly […]

Franco: “Dataquest helped me become a Data Scientist.”

While working as a Human Resources Analyst, Franco used Excel to analyze data. “We found that our employees were staying late to avoid traffic”, he explains. “After realizing this, we created different work schedules. This let employees select their schedules to better meet their needs.” While the outcome of their analysis was useful, their team of […]
