What Do You Really Need to Learn for Data Science? It Depends.
If you’re considering a new career in the field of data science, the options can feel simultaneously limiting and overwhelming.
On the one hand, it can seem like the entire industry boils down to “data analyst”, “data engineer”, and “data scientist”, and you have to fit into one of those three roles.
On the other hand, it can be seriously overwhelming, because the number of things you’re told you have to learn is massive. Consider, for example, this popular infographic on learning data science, which lays out the skills as a subway map:
This looks cool…but also incredibly intimidating and confusing!
Cool design? Absolutely. But that’s the kind of image that might cause a heart attack for aspiring data scientists who believe they actually have to know all of that to get a job in the industry.
That’s why a recent Twitter thread from Insight Data Science Program Director Alise Otilia R. about the variety of specialties and roles in the data science field felt like a breath of fresh air. So we caught up with Alise to learn more about her perspective on the opportunities available in the data science field.
Do You Have to Be An Expert in Everything?
“My inspiration [for that thread] was definitely because somebody transitioning to the field of data science could be overwhelmed by all of the things that you have to learn to be a data scientist,” says Alise, who was a Data Scientist at Spirit Airlines before moving to Insight Data Science.
“And there definitely are a lot of things,” she says. “It is a very interdisciplinary field, so computer science, programming, statistics, experimentation, business, SQL — there’s just so many things that you have to know to some extent.”
“But from my own experience interviewing as a data scientist, every single interview has been completely different. Some were very business-case, SQL heavy. Others were just case studies of machine learning. Others were onsite and strictly CS algorithms.”
“I started to realize that’s kind of where the frustration comes in,” she says.
“You feel like you have to be an expert in all of these different domains, which is pretty impossible, honestly.”
Part of the problem, Alise says, is that companies don’t agree on what a “data scientist” actually is. And those differences can be really overwhelming for someone who’s just entering the field and feels like they have to tick every skill box on every job posting to be qualified for any of them.
Drilling Down and Finding Your Passion
In her Twitter thread, Alise called out seven more specific roles that all might fall under the broad category of “data scientist”:
- Product Analytics Data Scientist
- Specialist (NLP, Computer Vision, etc.)
- Machine Learning Engineer
- DataOps Engineer
- Data Science Product Manager
- Data Visualizer
“To me, you really have to find where you fit in, and each of these have a strength,” Alise says. “All of them have a baseline fundamental of the core skills, but I think you have to gravitate towards what your strengths are when determining the best fit career for you within data science. And that can help alleviate some of the pain of feeling like you have to study everything.”
For example, Alise says, “I would consider myself more of a product-analytics-based data scientist. I just love the business aspect. I love being able to affect the direction that a business is taking, communicating to nontechnical members of teams. SQL’s definitely one of my stronger skill sets. Experimentation has always been interesting to me, just to see how a small change can affect the behavior of users.”
It’s important, she says, to find your own passion. That can help you drill down and focus on the skills that will best serve you for the specific roles you want. SQL and communication skills will be critical for a data scientist focused on product analytics, she says, but they might not be so important for a machine learning engineer.
“If you can sit down and code for eight hours straight, and that’s a great day for you, I would say something along this line [machine learning engineer] is more of a fit for you.” If that’s your goal, then a greater focus on programming fundamentals and software engineering practices would be warranted.
The same principle applies to the other roles Alise listed. If your passion is data visualization, for example, you probably don’t need to master the intricacies of every ML algorithm, but you will need to be familiar with the popular visualization libraries in your language of choice, and you’ll need some design and communication skills, too.
If your passion is maintaining the stability of production machine learning models, then you might be a good fit for DataOps, Alise says. You won’t need design skills or in-depth visualization library expertise for that, but you’ll probably want to learn tools like Docker and Apache Airflow.
In other words: while there’s a lot to learn in data science, you really don’t need to learn all of it to be successful. You don’t have to be scared by that subway map. You just need to learn the fundamentals, and then the core skills that are relevant to the data work you actually want to do.
And if your passions fit in multiple categories? Not to worry! “There definitely are hybrids,” she says, “so don’t feel like you have to fit into any one of these [job roles] neatly. A role can be any mixture of any of these buckets, for sure.”
Finding your passion will help lead you towards a career path.
Being Strategic on the Data Science Job Hunt
“Definitely don’t get overwhelmed by all the certifications and courses that exist,” Alise says. Instead, “ask yourself a series of questions, like why are you interested in data science to begin with? What really intrigued you about the field?”
“I think getting a grasp of what interested you in data science in the first place is really important,” she says. “Definitely focus on the fundamentals; you don’t have to be an expert in everything. Know what your strengths are.”
Then, she says, when it comes time for the job hunt, read closely and be strategic. “Don’t just apply for everything that says ‘data scientist’, because most roles won’t explicitly say this is machine learning or this is purely product analytics.”
“If you know which [type of data role] you are most interested in or more passionate about, I would really analyze the job description,” she says. Even if the job title is “data scientist,” you can often get hints from the required skills and other elements of the job description about whether they’re looking for a machine learning expert, a data viz wizard, a product analytics guru, or a generalist with a little experience in all of the above.
“Pick up on those keywords and phrases that are emphasized,” Alise says. “Does it say that you will be communicating effectively? Does it say that you will be creating reports or metrics? Things along those lines that will kind of help you to define which bucket does this role really fit into.”
If you reach an interview, Alise says, you can dig further into this by asking questions about the day-to-day responsibilities of the role. “Ask what percentage of the time you will be programming versus in meetings or presenting or creating a report, or doing ad hoc analysis versus machine learning. I think that’s a really good way to kind of understand how your time will be spent overall in this role.”
“If you gravitate toward the answers, then maybe that job is a good fit,” she says. “But if they say 80% of the time you’re going to be programming, but you really don’t see programming as one of your main strengths, or you’re really not interested in it, maybe that’s just not the data science role for you.”
“Really understanding where you fit in is the best thing, and then you can definitely tailor how to prepare for those roles and those interviews a little bit better.”
Not Sure Where to Start? Start Small
Another thing to consider, Alise says, is company size. Starting at a smaller company could be beneficial if you’re not exactly sure what you like.
“I think the size of the companies is very much an indicator of what you’re going to be doing as well. For startups and mid-sized companies even, I would say that most of the time you’re going to be doing a hybrid of all of those roles.”
Is it better for you to work at a startup or join a more established data team?
“I know in my time at Spirit there were only two data scientists, myself included,” she says, “so we were basically the data engineers creating pipelines. We were the business analysts and the product analytics data scientists, as well as the machine learning engineers. We were pretty much almost everything all in one.”
That kind of role “can definitely lead to burnout, which is a huge thing in data science, and can definitely happen very quickly,” she says. “But if you don’t know exactly where you fit in, maybe a role like that could be interesting. You can quickly learn what you like versus what you don’t like.”
For example, she says, “I quickly learned that machine learning engineer was not for me, but even data engineering and product analytics, I loved that part of it.”
On the other hand, if you do know for sure what you want to do, a bigger company with a larger data team might be the safest bet. “In large companies that have teams of maybe hundreds of data scientists,” she says, “you’re likely only working on a small part of a project or a model, so your role overall will be a bit more specific.”
You should also consider industry when you’re on the job hunt, she says. An industry like tech that’s already full of data scientists might be a good place to find specific roles, as companies are more likely to know exactly what they need and how they want to get it. You’re likely to also find better mentorship opportunities at a company that has an established data science team.
On the other hand, industries that are just starting to dip their toes into data science might offer more open-ended roles with more freedom to experiment. “I kind of liked going into a field or an industry where there weren’t as many data scientists,” Alise says about getting her start in the aviation industry, “just because I was able to kind of land and try a lot of things, and just kind of see where I fit in a little bit better, rather than being pigeonholed into a certain project or a certain application or a certain responsibility.
“I liked being autonomous and being able to play around with stuff, learning on my own.”
There’s a lot to learn in data science! You don’t have to learn everything, but you do have to remain open to learning new things and improving your skills regularly.
The Importance of Lifelong Learning
Another important thing for aspiring data scientists to know, Alise says, is that there’s not really an endpoint for learning data skills.
“You’re never going to be on top of everything in data science,” she says. “It’s kind of like emails, like when you try to get down to zero, and then five minutes later, you refresh and there are 10 more. That’s how I see data science. This is a field where you’re constantly having to learn, read.”
“I’ve been in the field for a couple of years, and I still learn something new every day,” Alise says.
However, she adds, “Don’t get overwhelmed by focusing on all the new tools and technologies that are coming out. Focus on the fundamentals, because those are never going to change, and that’s ultimately what you’re going to be tested on in your interviews.”
Want more wisdom from Alise? You can find her on Twitter and LinkedIn. She’s also open to being a mentor for people, especially women, who are interested in transitioning into data science. You can book a free appointment with her to chat about that here.