SQL Intermediate: PostgreSQL, Subqueries, and more!

If you’re in the early phases of learning SQL and have completed one or more introductory-level courses, you’ve probably learned most of the fundamentals and possibly even some high-level database concepts. As you prepare to embark on the next phase of learning SQL, it’s important to understand not only SQL itself, but also the […]

Python Cheat Sheet for Data Science: Basics

It’s common when first learning Python for Data Science to have trouble remembering all the syntax that you need. While at Dataquest we advocate getting used to consulting the Python documentation, sometimes it’s nice to have a handy reference, so we’ve put together this cheat sheet to help you out! This cheat sheet is the […]

Should I Learn Python 2 or 3?

One of the biggest sources of confusion and misinformation for people wanting to learn Python is which version they should learn. Should I learn Python 2.x or Python 3.x? Indeed, this is one of the questions we are asked most often at Dataquest, where we teach Python as part of our Data Science curriculum. This […]

Harry: “Dataquest helped me start my career in data”

While working as a geophysicist for an oil services company, Harry Robinson found himself interested in data. “My job involved lots of data, but it was always at arm’s length. We were applying algorithms, but I never got to see them. “I wanted to know what was happening and why, so I could interpret the results.” He decided […]

The Tips and Tricks I used to succeed on Kaggle

I learned machine learning through competing in Kaggle competitions. I entered my first competitions in 2011, with almost no data science knowledge. I soon ended up in fifth place out of a hundred or so in a stock trading competition. Over the next year, I won several competitions on automated essay scoring and bond price […]

SQL Basics: Working with Databases

SQL, pronounced “sequel” (or ess-cue-ell, if you prefer), is a very important tool for data scientists to have in their repertoire. You may well have heard the name and wondered what it is, how it works and whether you should learn it. To put it simply, SQL (Structured Query Language) is the language of databases […]

How to become a data scientist

Data science is one of the most buzzed-about fields right now, and data scientists are in extreme demand. And with good reason: data scientists are doing everything from creating self-driving cars to automatically captioning images. Given all the interesting applications, it makes sense that data science is a very sought-after career. Data science […]

NumPy Cheat Sheet — Python for Data Science

NumPy is the library that gives Python its ability to work with data at speed. Originally launched in 1995 as ‘Numeric,’ NumPy is the foundation on which many important Python data science libraries are built, including Pandas, SciPy and scikit-learn. It’s common when first learning NumPy to have trouble remembering all the functions and methods […]

Kyle: “Dataquest helped me get into the tech industry”

For the first four years of his career, Kyle Stewart worked as a product manager in industrial automation. “I was working for a Fortune 500 company. I managed products that helped industrial processes, like at an oil refinery.” He wanted to move into the more dynamic tech industry. “In industrial product management it’s difficult to make […]

What’s New in v1.14: Data Engineering Path & Performance Improvements!

Our latest Dataquest release has over 20 new features, including many major performance improvements and the launch of our much-anticipated data engineering path. New Path: Data Engineering. The first course in our Data Engineering Path is here! Data Engineering is a broad field which includes: working with Big Data, architecting distributed systems, creating reliable pipelines […]

Pandas Cheat Sheet — Python for Data Science

Pandas is arguably the most important Python package for data science. Not only does it give you lots of methods and functions that make working with data easier, but it has also been optimized for speed, which gives you a significant advantage compared with working with numeric data using Python’s built-in functions. It’s common when first […]