A Year Learning Data Science at Dataquest
What does a year of Dataquest actually get you? How long does it take to learn data science on Dataquest? These are questions we get asked a lot.
Every student is different, so these questions can be tough to answer. Learning on Dataquest is self-paced and some students move much more quickly than others. But in this article, we’ll outline a “typical” learning progression.
Specifically, this article will give you an idea of what you can expect to learn, and what kind of jobs you might be able to apply for, as you work through a year of learning data science with us.
Of course, one of the advantages of an online course like Dataquest’s is that you can work at your own pace and tailor your study to your background, skipping courses with content you’re already comfortable with. For the purposes of this article, we’re going to make some conservative assumptions:
- You’re busy, and can only dedicate around five hours per week to your studies.
- You have no previous programming experience.
- You have no math training (beyond high-school algebra).
- You’ve signed up for a Premium subscription (which gives you access to all Dataquest courses and projects, Premium support, career counseling, and more).
A lot of our students spend more than five hours per week studying, so this is a pretty conservative estimate of what you can get done in a year.
But if you put in at least five hours a week, we expect that in a year, you could finish the Data Analyst path, or get more than halfway through the Data Scientist path, and would be qualified for a variety of entry-level data analysis and data science jobs.
Let’s take a closer look at what that year would look like, what you would learn on the Data Scientist in Python path, and how you could best take advantage of your subscription over the course of the year. (Your experience on the Data Analyst path in Python or R would be very similar.)
A Year of Dataquest
- January-February (Weeks 1-8): Learn Python
- March-May (Weeks 9-20): Data Cleaning, Data Analysis, and Data Visualization
- May-July (Weeks 21-28): Command Line, SQL, APIs, and Web Scraping
- July-October (Weeks 29-40): Statistics for Data Science
- October-December (Weeks 41-50): Machine Learning
- Continuing Your Data Science Journey
January-February: Learning Python
The first eight weeks of your year would likely be spent learning Python. You might be able to get through our introductory and intermediate Python courses a little bit faster if you rush, but building a solid Python foundation is important for almost everything that comes afterwards. It’s worth taking a little extra time here to be sure you can understand and apply all the concepts.
The good news is that even if you started these eight weeks with zero coding experience, you’re going to end them as a programmer. After these courses, you’ll be able to confidently apply most of the important concepts of Python programming (from basics like functions and for loops to more advanced concepts like regular expressions and list comprehensions), and you’ll also be comfortable working with Jupyter Notebooks, an important tool for data scientists who use Python.
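To make that list concrete, here is a minimal sketch of those four ideas (a function, a for loop, a regular expression, and a list comprehension) working together. The post titles and the count_tagged helper are invented for illustration and are not taken from a Dataquest course.

```python
import re

# Hypothetical post titles, invented for this example
titles = [
    "Ask HN: How do you learn Python?",
    "Show HN: A tiny SQL tutorial",
    "Why data cleaning matters",
    "Ask HN: Best statistics books?",
]


def count_tagged(posts, tag):
    """Count the titles that start with a tag such as 'Ask HN'."""
    pattern = re.compile(rf"^{re.escape(tag)}:", re.IGNORECASE)
    count = 0
    for title in posts:  # a plain for loop
        if pattern.search(title):
            count += 1
    return count


# A list comprehension that keeps only the titles ending in a question mark
questions = [title for title in titles if title.endswith("?")]

ask_count = count_tagged(titles, "Ask HN")
show_count = count_tagged(titles, "Show HN")
print(ask_count, show_count, len(questions))
```

Nothing here goes beyond the standard library, so it runs the same way in a Jupyter Notebook cell or as a script.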
As you learn those skills and techniques, you’ll also get a solid introduction to the fundamentals of data analysis in Python. All of our courses have you working with real-world data, and as part of these courses you’ll apply what you’ve learned in guided projects that analyze which app store profiles lead to more downloads and what successful Hacker News posts have in common.
These two classes alone won’t be enough to get you a job in data science, but by the end of the eight weeks, you’ll find that you’ve learned enough to do some basic data analysis on your own, and probably code some other things, too! Just these eight weeks would be enough to give you some skills that might help you save some time on analytical tasks in your current job.
These first eight weeks are also a good time for you to establish a presence in our data science learning community. There, you can get help from your fellow students, as well as our data scientists and our career counselor. If you get stuck, this community is a great way to get yourself unstuck fast, and that’s important, particularly early in the learning process.
These courses are the foundation upon which the rest of your data science “house” will be built, so being extra thorough here will pay dividends later. If you don’t understand something, ask!
March-May: Data Cleaning, Data Analysis, and Data Visualization
These twelve weeks are where the rubber really starts to meet the road in applying your new Python skills to accomplish typical data science tasks. You’ll go through five courses here, and each one of them is crucial for doing data science.
In the first course, Pandas and NumPy Fundamentals, you’ll learn how to use the pandas library, a crucial tool for real-world data analysis tasks. You’ll also learn about NumPy, another useful Python package, and you’ll learn to make them play nicely together. Then you’ll apply that learning with a guided project analyzing real-world eBay car sales data.
From there, you’ll move into two courses about data visualization. The first, Exploratory Data Visualization, will teach you how to use the matplotlib package together with pandas to do exploratory visualizations that will help you make sense of your data and guide you in your analysis.
The second, Storytelling Through Data Visualization, will teach you more about how to make aesthetic, readable charts using Seaborn to ensure that you know how to communicate your data clearly to others (a crucial skill in any data science job). In these courses, you’ll synthesize what you’ve learned in guided projects analyzing topics like the gender gap in college degrees and geographical flight patterns (all using real-world data, of course).
Finally, you’ll move into two courses on data cleaning, one of the most un-sexy but essential skills in any data scientist’s toolkit. You’ll learn to explore and clean datasets, how to combine multiple datasets into a single, clean source, and work through some guided projects analyzing data from NYC high schools and a survey about Star Wars.
By mid-May (your 20th week), you’ll have acquired many of the foundational data science skills, and you should be well-equipped to start taking on your own data science projects. You might not be ready for a full-time data science job just yet, but you’ll know enough to be able to solve real-world problems with data science in a way that might impact your current job.
For example, after finishing our Pandas and NumPy course, Dataquest student Curtly Critchlow turned an Excel data analysis nightmare that took him a full week of work each month into a project that takes just a few minutes.
During these weeks, though, you may encounter a psychological phenomenon sometimes referred to as ‘The Dip’. This happens often in the course of learning a new skill; once you get beyond the beginner phase, big gains come a bit more slowly, and the novelty of studying something new has worn off. The result can be a bit of a dip in your natural level of motivation.
But don’t worry: we’ll help you fight the dip! All of our courses use interesting, real-world data to combat this effect by keeping you interested in the analysis, and you’ll be solving different and interesting problems in each course.
May-July: Learning the Command Line and SQL
As we get towards the middle of our year of data science, it’s time to cover some skills that are hugely important for working in data science: operating with the command line and working with SQL.
In the first two courses, you’ll learn to work with the command line. You’ll get comfortable navigating around without the use of a GUI, and working with Python scripts and packages from the command line. Then you’ll move on to more advanced topics, with a focus on processing text in the command line.
From there, you’ll start digging into our three SQL courses. In the first, you’ll learn the basics, like how to explore and analyze data in SQL, and how to use SQLite with Python. Then you’ll move into more intermediate topics like querying across multiple tables, and you’ll begin getting practice answering business questions using SQL. Finally, you’ll dig into the advanced stuff, like PostgreSQL and using database indexes to speed up your SQL queries.
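As a rough sketch of what using SQLite with Python and querying across multiple tables can look like, the example below builds a tiny in-memory database and answers one business question with a join. The table names, column names, and figures are all hypothetical, and the index is created in SQLite rather than PostgreSQL purely to keep the example self-contained.

```python
import sqlite3

# In-memory database, so nothing is written to disk
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Two small, made-up tables (hypothetical names and data)
cur.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, country TEXT)")
cur.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
cur.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(1, "Canada"), (2, "Brazil"), (3, "Canada")],
)
cur.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, 1, 20.0), (2, 2, 35.0), (3, 3, 15.0), (4, 1, 10.0)],
)

# An index on the join column. On a table this small it makes no
# practical difference; it only shows the shape of the statement.
cur.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")

# Business question: which country generates the most revenue?
cur.execute(
    """
    SELECT c.country, SUM(o.total) AS revenue
    FROM orders AS o
    JOIN customers AS c ON c.id = o.customer_id
    GROUP BY c.country
    ORDER BY revenue DESC
    """
)
revenue_by_country = cur.fetchall()
conn.close()
print(revenue_by_country)
```

The same pattern (connect, execute, fetch) carries over to larger databases; what changes is mostly the connection library and the size of the data.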
And while your new SQL skills will be crucial for working with most of the databases out there, there are plenty of other data sources you’ll want to work with, so after the SQL courses you’ll move into a course on APIs and web scraping that’ll teach you how to query APIs and scrape data from websites that don’t have APIs.
To cement these skills, you’ll answer some more real-world business questions with SQL, and dive into data from the CIA World Factbook.
At this point in your study, it’s a great time to start thinking about portfolio projects. Having a GitHub or some other portfolio page with compelling projects is key to landing a job in data science, and Dataquest is full of guided projects that you can absolutely use for a portfolio. You’ll have worked through some of these already, so this is a good moment to look back and think about adding some polish to your favorites so you’ve got some cool projects ready when you start applying for jobs.
July-October: Learn Statistics
By this point, you’ll have the programming skills to do a lot of data analysis, but you still need a solid understanding of statistics and probability to get the most out of them. In this section of your year of Dataquest, you’ll take a sequence of four courses aimed at giving you a solid stats foundation and helping you apply these concepts in Python.
You’ll start with the basics, like learning different sampling techniques for taking good samples from your data. Then you’ll start looking at distributions, measuring variability, and locating and comparing values with z-scores. Finally, you’ll learn more about probability and dig into advanced topics like significance testing and the chi-squared test.
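Two of those ideas fit in a few lines of Python. The sketch below computes a z-score and a chi-squared statistic by hand; the rental counts and the observed and expected values are made up for illustration, and the code uses only the standard library rather than any particular course’s tooling.

```python
import statistics

# Made-up daily rental counts (hypothetical data)
rentals = [100, 120, 140, 160, 180]

mean = statistics.mean(rentals)
std = statistics.pstdev(rentals)  # population standard deviation


def z_score(value):
    """How many standard deviations a value sits from the mean."""
    return (value - mean) / std


z_busiest = z_score(max(rentals))

# Chi-squared statistic: sum of (observed - expected)^2 / expected
observed = [60, 40]  # made-up counts for two categories
expected = [50, 50]  # counts we would expect if there were no difference
chi_squared = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

print(round(z_busiest, 2), chi_squared)
```

Note that this only produces the test statistic. Turning it into a p-value requires the chi-squared distribution, which the standard library does not provide.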
As usual, as you work through these courses, you’ll be using real-world data to answer interesting questions, like how a bike-sharing company can anticipate rental patterns. And you’ll be able to apply your new skills to cool guided projects like figuring out winning Jeopardy strategies and determining whether a movie ratings site’s ratings are biased.
It might be possible to work through these courses in less than the three months we’ve allotted, but we’d suggest that you take your time here and really be sure you understand everything. While it’s easy to check references if you forget how to do something in Python or SQL, misunderstanding the math that underlies your analysis could have more serious consequences, and it’s harder to catch.
This is a great time to branch out a bit more and start making some connections in the data science community. Our own community is a good place to start if you’re not already active there. You may also want to work on building your brand as a data scientist by getting yourself out there in other ways, like by writing a tutorial for the Dataquest blog.
If you’re interested in data analyst positions, you can begin applying for jobs at any point during these months. Having Python, SQL, and statistics skills will qualify you for most data analyst positions.
October-December: Dig Into Machine Learning
By this point, you’ll have the programming skills to do a lot of data analysis, and the statistics skills to understand what’s happening under the hood. But if you want to work as a data scientist, you need to add one more big skill to your skill set: machine learning.
Over these months, you’ll start digging into our machine learning course offerings. You’ll start with the fundamentals of machine learning, and then you’ll learn about some important calculus and linear algebra concepts that underpin key machine learning algorithms.
Learning at this relatively slow pace, it’s unlikely you’ll make it all the way to the end of our Data Scientist path in this time frame, and there are still quite a few courses on cool topics like Natural Language Processing and Apache Spark in your future.
But once you’ve gotten to this point, you should be ready to start applying for entry-level data science jobs. You have the programming skills, the statistics knowledge, and a strong foothold in machine learning. And while you should always plan to continue learning, these skills — and the project portfolio you’ve built while acquiring them — already make you a compelling candidate.
If you’re not sure how to start the job hunt, reading through our Data Science Career Guide would be a great start!
This Is Just the Beginning of Your Data Science Journey
Assuming you’ve stuck to just five hours per week, you’re likely to have completed the Data Analyst path and gotten a strong foothold in the machine learning section of the Data Scientist path by the end of your first year of study.
At this point, you’ll be well-qualified to apply for data analyst positions. We have many students, like Pol Brigneti, who have finished our Data Analyst path and found full-time data analyst positions. If that’s your choice, then you don’t need to worry much about machine learning skills, and you can spend the extra three months building cool projects, applying for jobs, and adding new skills to your skill set. (For example, since you’ve been learning Python, it couldn’t hurt to also learn the fundamentals of R in case you come across jobs that prefer it.)
You’ll also be ready to start applying for entry-level data science positions and internships, although there’s plenty more to learn about machine learning and the more advanced topics covered later in our Data Scientist path.
And remember, this is a pretty conservative estimate. Spending a little more time each week studying will get you further, faster. At around 10 hours per week, we estimate you’d be able to finish the entire Data Scientist path in a year.
Even if you don’t aspire to go all the way through the Data Scientist path, it pays to keep learning while you’re searching for jobs, and even after you find employment. That’s what Miguel told us after he got his full-time job just halfway through our Data Scientist path. “Even though I’m starting a job in January, I’m still going to be active, and I’ll keep on studying, because obviously I want to reach other paths.”
“And I still think Dataquest is the best option,” he told us. “If I had to choose only one, I’d choose Dataquest.”