When I launched Dataquest a little under two years ago, one of the first things I did was write a blog post about why. At the time, if you wanted to become a data scientist, you were confronted with dozens of courses on sites like edX or Coursera with no easy path to getting a job.
I saw many promising students give up on learning data science because they got stuck in a loop of taking the same courses over and over. There were two main barriers to learning data science that I was trying to solve with Dataquest: the challenge of getting from theory to application, and the challenge of knowing what to learn next.
I strongly believe that everyone deserves a chance to do work that they find interesting, and Dataquest was a way to put that belief into action and help others get a toehold in a difficult field. Over the past two years, we've made it simple to learn all of the skills you need for a data science role in one place. From basic Python to SQL to Machine Learning, Dataquest teaches you the right skills, and helps you build a portfolio of projects along the way.
As we've built the site, we've learned quite a few lessons on how to most effectively help our students. We've been gradually increasing the scope of our initial vision. In this post, I want to outline what we're focused on now, and where we're headed. Along the way, I hope to make the case for why Dataquest is the place you should be learning data science.
Two Years Of Observations
It's a common refrain that learning is its own reward. Massive Open Online Course (MOOC) sites like the aforementioned edX and Coursera were created with this wisdom in mind. What we've found instead is that our students are learning data science both because they enjoy it and because they want more interesting jobs.
This observation has pushed us to become more career-focused. The most common thing students want is a better path to data science careers, and we feel that it's the highest-leverage thing we can work on.
As we help people get ready for new careers, we've made four key observations:
- Focus is critical to retaining knowledge, especially when you have limited time
- Motivation is the most important determinant of whether you'll get a job
- It's easy to get "stuck" and frustrated — timely help is key
- There isn't a lot of good career advice and interview preparation help
Let's dive into each of these observations in more depth, and see how they've affected our thinking.
When you're learning data science, it's tempting to get lost in a sea of tools. You're told that you have to learn R, Python, Spark, and Tensorflow. If you don't, you're not a "real" data scientist.
What we've found instead is that the students who end up getting jobs focus on concepts over tools. If you learn how to implement a random forest from scratch, and know the tradeoffs involved in training it, it doesn't matter if you use Python, Scala, or R to make predictions. Concepts generalize between tools; if you learn a concept well, you can use any tool to implement it. If you can fit a decision tree model in R, you'll have some job prospects, but if you deeply understand the model and how it works, you'll have an order of magnitude more.
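As a rough illustration of what "concepts over tools" means (this is a minimal sketch, not taken from the Dataquest curriculum), here is the split-selection step at the heart of a decision tree, the building block of a random forest, written in plain Python with no libraries. The function names and the toy data are made up for the example. Once you understand why the split is chosen this way, the same idea carries over to whatever library or language you end up using.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means perfectly pure)."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((n / total) ** 2 for n in counts.values())


def best_split(values, labels):
    """Find the threshold on one numeric feature that minimizes weighted Gini impurity.

    Returns (threshold, impurity); rows with value <= threshold go to the left branch.
    """
    best_threshold, best_impurity = None, float("inf")
    total = len(values)
    # Skip the largest value: splitting there would leave the right branch empty.
    for threshold in sorted(set(values))[:-1]:
        left = [lab for v, lab in zip(values, labels) if v <= threshold]
        right = [lab for v, lab in zip(values, labels) if v > threshold]
        impurity = (len(left) * gini(left) + len(right) * gini(right)) / total
        if impurity < best_impurity:
            best_threshold, best_impurity = threshold, impurity
    return best_threshold, best_impurity


# Toy, made-up data: hours studied per week and whether a project was finished.
hours = [1, 2, 3, 8, 9, 10]
finished = ["no", "no", "no", "yes", "yes", "yes"]
threshold, impurity = best_split(hours, finished)
print(threshold, impurity)
```

A full decision tree applies this search recursively to each branch, and a random forest trains many such trees on random subsets of the rows and features. Those are the tradeoffs worth understanding, regardless of the tool.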
Focusing on a few concepts at a time and mastering them before moving on is key to retaining knowledge. We've kept Dataquest extremely focused, so knowledge sinks in. We have a linear curriculum that takes you from no programming knowledge all the way to advanced machine learning. Because we develop the entire curriculum, we're able to teach things in logical order, and make sure you're never lost. Our consistent style and focus on concepts mean that you can stay focused on learning one concept at a time.
The beginning of our data science roadmap.
It's often taught in school that it's a teacher's job to teach, and your job to be motivated. But if you're unmotivated, even a teacher who knows the material well won't be effective. We've found that motivation is the single biggest difference between students who get jobs and those who don't. It's not enough to just "check the boxes" and get certificates. You have to build projects to demonstrate your skills, and build a portfolio. In order to be motivated to build effective projects, you have to genuinely enjoy working with data. As I wrote in a blog post on how to learn data science, a prerequisite for learning data science is finding problems that interest and motivate you.
At Dataquest, we've realized that it's our job to be motivating, and we've oriented the site around it. We've designed our curriculum to interleave dozens of interesting data sets, including data on CIA interventions and NBA player stats. When you're ready for them, we include dozens of interesting projects exploring topics like how to win Jeopardy and stock price forecasting. By focusing on engaging and motivating you, we help you get further in your journey to get a data science job.
A guided project where you analyze discrepancies in scores between movie review sites.
When it comes to more open-ended projects, we've found that students need help getting "unstuck". Being stuck can range from not knowing how to install a package to having trouble conceptualizing the structure of the data. Students often don't need major help — just a small nudge in the right direction or a confidence boost can be invaluable.
We've realized that as these small moments of frustration pile up, they decrease your motivation and make it less likely that you'll reach your goals. We've designed systems that ensure you can get help from either a mentor or your peers to avoid this frustration. We help students directly with 1:1 mentorship and office hours. We've also created a strong community where students help each other learn and avoid common pitfalls.
We've noticed that many of our students have career questions, which range from wondering what skills they should learn to be most marketable to employers, to what questions might be asked in an interview, to what their portfolio should look like. Many of these questions are best answered by peers, and we've encouraged students to help each other advance their careers. We also offer office hours to help more directly with career questions.
There's quite a bit more we want to do to help students in this area, though. We want to offer everything from helping students understand what working at specific companies is like to reviewing portfolios. Career advice is an area we're actively expanding and working on, and we'll be introducing some exciting additions to Dataquest soon!
Based on the above observations, we've realized that the ideal data science learning tool:
- Gives you a roadmap for learning data science
- Allows you to practice skills by coding in the browser
- Teaches advanced concepts in an applied fashion
- Helps you build your portfolio with projects
- Gives you support along the way with mentor and community help
- Guides you on career choices and helps you find potential employers
There's currently no tool that does all of the above, although some (including Dataquest) cover several. We have the most work to do at the end of the learning journey, when students want to build more advanced projects and look for jobs. Let's go through each point, and talk about where we're at with it.
1. Data science roadmap
A roadmap for data science lets you stay focused and on track, without having to figure out which course to take next. With our comprehensive paths that teach you how to be a data analyst or a data scientist, we take you through all the material you need to know and help you build projects, all in a clear, consistent way that's designed to help you get the job you want.
This year, we plan to develop more roadmaps, for fields like Data Engineering.
2. In-browser coding
It's amazing how long installing packages like pandas or tools like Spark can take when you're a beginner. We let you get your feet wet in the browser while you learn the skills. Afterwards, when you understand them better, we help you get everything set up on your own computer so you can work on your own.
We also score your answers in the browser, so you know when you're on track. We've found that in-browser practice is very motivating, and helps people hit their goals.
This year, we plan to add better ways to practice concepts using spaced repetition and other methods that help complex topics really sink in.
3. Applied concepts
Our missions teach you data science concepts like decision trees by having you work through interesting datasets. You might work through data on airline accidents, or educational achievement worldwide. Once you've learned the skills, you're able to apply them with projects that use more interesting datasets. This loop of learning then application helps you quickly develop and solidify your skills.
We plan to use larger and more varied datasets this year, including audio, video, and image data.
4. Portfolio projects
We help you build a portfolio of projects. Not only does this help you practice and learn concepts, it also helps you get job interviews! Hiring managers are increasingly looking at portfolios when making decisions on who to interview. Even interviews have moved more towards projects as a means of assessment: you might get a take-home or in-person project as part of your interview.
This year, we plan to have more open-ended projects, and offer more help with them. Imagine working together with your peers to create a bot that can have a conversation with you as part of a project!
5. Support along the way
Right now, you can get help from other students learning on Dataquest, or from our teachers. This help is critical in keeping you focused and motivated.
This year, we plan to offer more hands-on help, including reviewing projects and assigning group work.
6. Career help
Right now, we give career advice during office hours, where we have 1:1 conversations with students. This year, we plan to develop a more robust careers section that helps you figure out how and where to interview. We also plan to more directly help you with your job search.
As you've read, you can expect a lot of improvements for Dataquest in 2017. But progress doesn't happen all at once; it happens regularly, as we constantly tweak the Dataquest experience. We now have a regular release schedule, and you can expect substantial improvements every month. Take a look at the posts for our past two releases for an idea of how quickly Dataquest is evolving.
In the next 3 months, you can expect:
- An easier-to-use coding interface
- Portfolio reviews and feedback
- Improved statistics and machine learning content
- The launch of the long-awaited Data Engineering learning path
- Better ways to practice concepts you've learned