How to Learn Data Science in 2022 (A CEO’s In-Depth Guide)
The demand for data scientists is at an all-time high. If you’re considering a career in data science, now’s the best time to get started.
But what’s the best way to learn data science?
That’s a complicated question — I know from experience. A few years ago, I decided to pursue a data science career, but when I researched what I needed to learn, all I could find were long lists of data science courses to take and books to read.
However, studies show that most people learn best by doing, not by watching videos or memorizing textbooks.
So what’s the most effective way to learn data science? I’ve broken it down into five easy steps.
1. Find a reason to learn
The data science field is very broad, and there’s a tremendous amount of available information. That means it can be difficult to determine what you should focus on. The secret to navigating all this information is a reason to learn. Identify your motivation, and use it to guide your data journey.
For me, that reason was completing a stock market prediction program. To learn how to do it, I dove into statistics with a passion, and having a concrete goal kept me motivated. When you learn by doing, you retain information longer, and you gain experience you can rely on in the future.
Take control of your learning by tailoring it to your goals, not the other way around.
2. Nail the fundamentals
It’s tempting to get carried away learning specialized topics, like machine learning, neural networks, and image recognition. However, 90% of your work as a data scientist will be cleaning data. You can’t run before you learn to walk.
You’ll find more success if you master the simple stuff before spending your time on advanced topics. Learn linear regression, k-means clustering, and logistic regression, then use what you know to complete projects and build a portfolio.
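To give a sense of how small these fundamentals can be, here is a minimal sketch of simple linear regression (one input, one output) using only Python's standard library. The function names `fit_line` and `predict` and the hours-versus-score numbers are made up for illustration; in practice you would more likely reach for a library such as scikit-learn.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for a single feature: returns (slope, intercept)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two paired observations")
    x_bar, y_bar = mean(xs), mean(ys)
    # slope = covariance(x, y) / variance(x); the 1/n factors cancel out
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical, so the slope is undefined")
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def predict(x, slope, intercept):
    """Apply the fitted line to a new x value."""
    return slope * x + intercept


# Hypothetical data: hours studied vs. quiz score (illustrative only)
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 68]

slope, intercept = fit_line(hours, scores)
print(f"score ~= {slope:.2f} * hours + {intercept:.2f}")
print(f"predicted score after 6 hours: {predict(6, slope, intercept):.1f}")
```

Writing the formula out by hand once, then checking it against a library, is a reasonable first portfolio exercise.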
This is why I created Dataquest the way I did. Projects are a critical part of becoming a data scientist, and employers will use your portfolio to evaluate you as a job candidate. Virtually every course we build at Dataquest offers a hands-on project you can complete to expand your portfolio.
Whether you’re just getting started learning data science or simply picking up a new skill (like SQL), we have the courses that will help you land your dream job. Check out our fun and interactive courses here.
3. Learn to communicate
Data scientists constantly need to present the results of their analyses. Knowing how to do this is the difference between being a mediocre data scientist and a great one. A data scientist is only as valuable as the insights they can share. That means you need to learn how to be a great communicator.
There are three aspects of communicating insights:
- Understand the topic. (You’ll never be able to explain something that you don’t understand yourself.)
- Organize your results.
- Explain your analysis.
It can be difficult to communicate complex concepts effectively, but here are some tips:
- Start a blog. Post the results of your data analysis. Or submit a pitch and write for Dataquest’s blog!
- Try to teach your less tech-savvy friends and family about computer science concepts. It’s amazing how much teaching can help you understand concepts.
- Try to speak at meetups.
- Use GitHub to host and share your analysis.
4. Learn from your peers
You can learn a lot working with others. It’s not unusual for a data scientist to move from team to team as they work on answering data questions from different departments. That makes collaboration essential for data scientists.
Here are some ideas to help you collaborate more effectively:
- Find people to work with at meetups.
- Contribute to open-source packages.
- Message people who write interesting data analysis blogs to see if you can collaborate.
- Try out Kaggle, a machine learning competition site, and see if you can find a teammate.
5. Increase the difficulty
When was the last time a project you were working on challenged you? Data science is a huge field. You’ll never understand all of it. But the more you learn, the more valuable you will be to the teams you work with.
If you’re becoming too comfortable with your projects, here are some ideas that can help you level up your skill set:
- Take more advanced data science courses (Dataquest has these).
- Work with a larger dataset. Learn to use Spark.
- Try to make your algorithm faster.
- How would you scale your algorithm to multiple processors? Can you do it?
- Improve your understanding of the theory behind the algorithm you’re using.
- Try to teach a novice to do the same things you’re doing now.
That last one is a really underrated challenge, and if you give it a try, you’ll quickly see how valuable teaching can be to someone who’s trying to learn.
You’ll likely come out of the experience with a deeper understanding of the topic than you had before, and you’ll improve your communication skills.
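As for the "make your algorithm faster" idea above, here is a small sketch of what that exercise can look like. The task (checking a list for duplicates) and both function names are assumptions picked purely for illustration: the first version compares every pair, the second remembers what it has seen in a set, and `timeit` from the standard library measures the difference.

```python
import timeit


def has_duplicates_slow(items):
    """Compare every pair of items: roughly n*n/2 comparisons."""
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            if items[i] == items[j]:
                return True
    return False


def has_duplicates_fast(items):
    """Track seen values in a set: one pass, with average constant-time lookups."""
    seen = set()
    for item in items:
        if item in seen:
            return True
        seen.add(item)
    return False


# Worst case for both versions: no duplicates, so neither can exit early
data = list(range(1000))
assert has_duplicates_slow(data) == has_duplicates_fast(data)

for fn in (has_duplicates_slow, has_duplicates_fast):
    seconds = timeit.timeit(lambda: fn(data), number=3)
    print(f"{fn.__name__}: {seconds:.4f}s for 3 runs")
```

The exact timings depend on your machine, but the gap should widen quickly as the list grows. Spreading the work across multiple processors (for example with Python's `multiprocessing` module) is a separate next step that this sketch does not cover.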
Becoming a Data Scientist: Common FAQs
Where do I find projects to get started?
Some ideas you might want to consider: uncovering new and interesting things about your city, mapping all the devices on the internet, finding the real positions NBA players play, or mapping refugees by year. The possibilities are limitless.
What’s something that interests you? This is probably a fantastic place to begin.
Another way to find a project is to look at datasets and see what types of questions you can come up with.
Here are some good places to find free datasets to get you started:
Do you need a data science certificate?
Now that you’re ready to start learning, you’re probably wondering if a data science certificate is worth your effort.
Certificate programs can be incredibly valuable if they can teach you the skills you need to effectively perform your job. When employers look at your resume, they’re looking at your skills, your project portfolio, and your relevant experience — certificates can help communicate this quickly.
Here’s some more information about data science certificates and whether or not you need one.
Do you need a degree in data science?
Having a data science degree on your resume might help you get a job. However, getting one typically takes years and costs tens, if not hundreds, of thousands of dollars.
Universities can also be subject to institutional inertia and can be slow to adapt. This means you can end up spending time on older technologies that aren’t as relevant in the current business environment.
Thankfully, there are many examples of people who’ve successfully learned data science on their own. For example, I worked as a machine learning engineer at edX before starting Dataquest, but I don’t have a degree in data science or machine learning. I taught myself those skills.
Our Dataquest learner stories are also full of learners who have industry jobs with zero background in programming and no data science degree. Our 2020 survey covered hundreds of respondents who’ve met their data science learning goals without a degree.
If you have the time and money to earn a university degree in data science, adding it to your resume can definitely help you. But it’s very possible to learn all of the necessary skills faster and much more affordably.
What skills do data scientists need to succeed?
Based on job postings and what data scientists report doing at work, the most fundamental data science skills include the following:
- Programming in Python or R
- Fluency with popular packages and workflows for data science tasks in your language of choice. If you choose Python, for example, you should be familiar with libraries like pandas, NumPy, matplotlib or Plotly, and scikit-learn, and you should be comfortable with cleaning, analyzing, and visualizing big data using them.
- Writing SQL queries
- Basic machine learning and modeling
- Workflow and collaboration (Git, command line/bash, etc.)
If you can add these fundamentals to your skill set, you’ll be in a great position to get your first data science job. For more information, you can take a look at our Data Scientist learning path. We designed it to teach all of the important data science skills for Python learners.
From there, you can dig deeper into specializations like Natural Language Processing, Image Classification, Deep Learning, and a variety of other options.
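To make the "writing SQL queries" skill from the list above a bit more concrete, here is a minimal sketch that runs a filter-and-aggregate query from Python using the built-in `sqlite3` module. The `players` table, its columns, and every row in it are invented for illustration and don't come from any real dataset.

```python
import sqlite3

# In-memory database, so nothing is written to disk
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE players (name TEXT, team TEXT, points INTEGER)")
conn.executemany(
    "INSERT INTO players VALUES (?, ?, ?)",
    [
        ("Ann", "Hawks", 21),
        ("Bo", "Hawks", 9),
        ("Cy", "Owls", 15),
        ("Di", "Owls", 30),
    ],
)

# Keep players with at least 10 points, then summarize each team
query = """
    SELECT team, COUNT(*) AS n_players, AVG(points) AS avg_points
    FROM players
    WHERE points >= 10
    GROUP BY team
    ORDER BY avg_points DESC
"""
rows = conn.execute(query).fetchall()
conn.close()

for team, n_players, avg_points in rows:
    print(f"{team}: {n_players} player(s), average {avg_points:.1f} points")
```

Filtering with `WHERE`, grouping with `GROUP BY`, and sorting with `ORDER BY` cover a large share of everyday analysis queries, so they are a sensible place to start practicing.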
The bottom line
I generally dislike the “here’s a big list of stuff” approach, because it makes it difficult to determine what to do next. I’ve seen many people give up on learning when confronted with a giant list of textbooks and MOOCs.
I personally believe that anyone can learn data science if they approach it with the right frame of mind.
I’m also the founder of Dataquest, a site that helps you learn data science in your browser. It encapsulates a lot of the ideas discussed in this post to create a better learning experience. You learn by analyzing interesting datasets like CIA documents and NBA player stats. You also complete projects and build a portfolio as you work through our courses.
Don’t worry if you don’t know how to code — we teach both Python and R from scratch, no experience required! We teach Python and R because they’re beginner-friendly languages and because they’re the most popular languages used in real-world data science.
Some helpful data science resources
As I worked on projects, I found these resources helpful.
- Dataquest: learn data science in your browser, complete projects, and build a portfolio.
- Khan Academy: good basic statistics and linear algebra content.
- Introduction to Linear Algebra, 4th Edition: great linear algebra book by Gilbert Strang.
- Calculus Online Textbook: great calculus book, also by Gilbert Strang.
- The Elements of Statistical Learning: good machine learning book.
- Andrew Ng’s Machine Learning Class: the original machine learning class. Mostly video-based.
- OpenIntro Statistics: good basic stats book.
- Statsoft statistics textbook: good for looking up statistics concepts.
If you’re ready to start learning data science and data analytics, Dataquest can help. Start your journey today.