What is Data Science?
Data science is a field of study and practice that’s focused on obtaining insights from data.
Practitioners of data science use programming skills, statistics knowledge, and machine learning techniques to mine large data sets for patterns that can be used to analyze the past or even predict the future.
Math Meets Programming: A Quick History
To get a better idea of what data science really is, it’s helpful to take a quick look at where it comes from. In many ways, data science is the result of a merger between two fields that have been around for decades: statistics and computer science.
Statisticians, of course, have been crunching numbers for centuries. But the dawn of computer science in the mid-20th century provided statisticians with a new tool for analyzing data faster than had previously been possible.
As early as the 1960s, statisticians like John W. Tukey were theorizing about how computers could revolutionize the field, but computers had little impact at the time because they were simply too slow and too expensive. In the 1980s, the rise of personal computers made digital data collection possible, and companies started collecting what they could. By the 1990s, some were successfully making use of that data to design marketing strategies. Analyzing these new digital data sets required both the statistics knowledge of a statistician and the programming skills of a computer scientist.
By the early 2000s, thanks in part to the advent of the internet, many companies had access to mountains of data. At the same time, computer processing power had advanced to the point that complex analyses of huge data sets were possible, and more advanced techniques like predictive analytics with machine learning were coming into reach.
Both business and academia began to recognize the value of having experts with the programming skills required to collect, manipulate, and analyze digital data and the statistics skills required to select the type of analysis needed to accurately answer questions and gain meaningful insights. “Data Science,” a term that had been around for decades by that point, became the mainstream phrase of choice to describe this confluence of skills.
What Do Data Scientists Do?
In day-to-day work, data scientists are often responsible for everything that happens to data, from collecting it all the way through analyzing it and reporting on the results. Although every data science job is different, here’s one way to visualize the data science workflow, with some examples of typical tasks a data scientist might perform at each step.
It works like this:
- Capture data. For example: pulling the data from a company database, scraping it from a website, accessing an API, etc.
- Manage data. For example: storing the data properly and, almost always, cleaning it.
- Exploratory analysis. For example: performing different analyses and visualizing the data in various ways to look for patterns, questions, and opportunities for deeper study.
- Final analysis. For example: digging deeper into the data to answer specific business questions, and fine-tuning predictive models for the most accurate results.
- Reporting. For example: presenting the results of analysis to management, which might include writing a report, producing visualizations, and making recommendations based on the results of analysis. Reporting might also mean plugging the results of analysis into a data product or dashboard so that other team members or clients can easily access it.
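To make those steps a bit more concrete, here is a minimal Python sketch of the same workflow on a tiny, made-up data set. Everything in it is an assumption for illustration: the `RAW_CSV` numbers, the column names, and the function names are hypothetical, and a real project would typically use dedicated libraries and far more data.

```python
import csv
import io
from statistics import mean

# Hypothetical raw export: daily ad spend and orders (made-up numbers).
RAW_CSV = """day,ad_spend,orders
1,100,12
2,150,17
3,,15
4,200,22
5,250,27
"""


def capture(text):
    """Capture: read rows from a CSV source (here, an in-memory string)."""
    return list(csv.DictReader(io.StringIO(text)))


def manage(rows):
    """Manage: drop rows with missing values and convert text to numbers."""
    cleaned = []
    for row in rows:
        if row["ad_spend"] != "" and row["orders"] != "":
            cleaned.append({
                "day": int(row["day"]),
                "ad_spend": float(row["ad_spend"]),
                "orders": float(row["orders"]),
            })
    return cleaned


def explore(rows):
    """Exploratory analysis: a few basic summary statistics."""
    return {
        "rows": len(rows),
        "mean_spend": mean(r["ad_spend"] for r in rows),
        "mean_orders": mean(r["orders"] for r in rows),
    }


def fit_line(rows):
    """Final analysis: least-squares fit of orders = intercept + slope * ad_spend."""
    xs = [r["ad_spend"] for r in rows]
    ys = [r["orders"] for r in rows]
    x_bar, y_bar = mean(xs), mean(ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    slope = s_xy / s_xx
    return slope, y_bar - slope * x_bar


def report(summary, slope, intercept):
    """Reporting: turn the numbers into a short plain-language summary."""
    return (
        f"Analyzed {summary['rows']} complete days of data. "
        f"Average spend was {summary['mean_spend']:.2f} and average orders were "
        f"{summary['mean_orders']:.2f}. The fitted line is "
        f"orders = {intercept:.2f} + {slope:.3f} * ad_spend."
    )


rows = manage(capture(RAW_CSV))
summary = explore(rows)
slope, intercept = fit_line(rows)
print(report(summary, slope, intercept))
```

Each function maps to one step in the list above, which is a deliberate simplification. In practice the steps overlap, and you will often loop back to cleaning after exploration turns up something odd.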
All of that said, what data scientists do from day to day can vary tremendously, in no small part because different companies make use of data science in different ways.
How Businesses Benefit From Hiring Data Scientists
At the highest level, data science is what allows companies to convert data into actual business value.
Consider, for example, a specialty ecommerce retailer. Such a company might get tens of thousands of pageviews every day, and hundreds of orders. With each pageview, it can use automated tools to collect a large amount of data about who the visitors are and what actions they’ve taken on the site. And with each order, its sales system can easily collect a variety of data points about actual customers.
This data piles up quickly, however, and it doesn’t have any inherent value on its own. To get value from it, the company will need to analyze it, looking for patterns and insights that suggest future business strategies and tactics. The more actionable, forward-looking insight this data provides, the more value it has to the company.
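As a rough illustration of what "looking for patterns" might mean for a retailer like this, the sketch below computes the share of visits that ended in an order, broken down by traffic source. The `events` list, its fields, and the `conversion_by_source` function are all hypothetical; a real site would pull this kind of data from its analytics and sales systems.

```python
from collections import Counter

# Hypothetical event log: (visitor_id, traffic_source, placed_order).
events = [
    ("v1", "search", True),
    ("v2", "search", False),
    ("v3", "email", True),
    ("v4", "email", True),
    ("v5", "social", False),
    ("v6", "search", False),
    ("v7", "social", False),
    ("v8", "email", False),
]


def conversion_by_source(events):
    """Return the fraction of visits that led to an order, per traffic source."""
    visits = Counter()
    orders = Counter()
    for _visitor_id, source, placed_order in events:
        visits[source] += 1
        if placed_order:
            orders[source] += 1
    return {source: orders[source] / visits[source] for source in visits}


rates = conversion_by_source(events)

# List sources from highest to lowest conversion rate.
for source, rate in sorted(rates.items(), key=lambda item: item[1], reverse=True):
    print(f"{source}: {rate:.0%} of visits led to an order")
```

With only eight made-up visits, the result means nothing statistically. The point is the shape of the question: a simple breakdown like this is often the first step toward deciding where to focus marketing effort.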
Because there are many different types of data companies can collect, there is a wide variety of ways data scientists can add value. Here are just a few examples of how data science adds value at businesses across the globe:
- Improving decision-making — data science gives management actionable intelligence that leaders can use to shape short- and long-term strategies.
- Improving hiring — data science can help more objectively evaluate candidates and root out inefficiencies and biases.
- Predicting the future — using machine learning algorithms, data scientists can find patterns in data that humans would not be able to, and forecast future results with a higher level of accuracy.
- Improving targeting — data science can help companies find new target markets, better understand existing customers, and more accurately predict what customers want.
- Identifying new opportunities — by exploring data and looking for patterns, data scientists can identify new business opportunities that might not otherwise be apparent.
- Improving risk assessment — data science often makes it possible to “test” risky ideas by running the numbers before putting them into action, allowing companies to avoid potentially costly risks and mistakes.
- Fostering data-first culture — a data scientist or data science team can help facilitate data-based decision-making in every team across the company by providing them with data tools like dashboards and the training necessary to understand them.
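To show what "running the numbers" on a risky idea can look like, here is a small scenario analysis in Python. It asks whether a hypothetical free-shipping offer would pay off under different assumed increases in orders. All of the figures (`BASELINE_ORDERS`, `AVG_MARGIN`, `SHIPPING_COST`, and the uplift values) are invented for illustration, and the model is deliberately simplistic.

```python
# Hypothetical inputs for a free-shipping offer (all numbers are made up).
BASELINE_ORDERS = 300   # orders per day today
AVG_MARGIN = 20.0       # profit per order before shipping, in dollars
SHIPPING_COST = 5.0     # shipping cost the company would absorb per order


def profit_change(uplift):
    """Daily profit with free shipping minus daily profit today.

    `uplift` is the assumed fractional increase in orders (0.25 means +25%).
    """
    baseline_profit = BASELINE_ORDERS * AVG_MARGIN
    new_orders = BASELINE_ORDERS * (1 + uplift)
    new_profit = new_orders * (AVG_MARGIN - SHIPPING_COST)
    return new_profit - baseline_profit


# Compare a pessimistic, a middle, and an optimistic scenario.
for uplift in (0.10, 0.25, 0.40):
    change = profit_change(uplift)
    verdict = "gain" if change > 0 else "loss"
    print(f"+{uplift:.0%} orders: daily profit changes by {change:+.2f} ({verdict})")
```

Under these assumptions, the offer only pays off if orders rise by more than about a third. The exact threshold matters less than the habit of working it out before committing to the idea.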
Available Careers in Data Science
Many professionals enter the field as data analysts, a more entry-level role with a lower technical skill threshold, and then move up to the data scientist level once they have a bit of professional experience. It is possible to get hired directly as a data scientist, too. For more details on various job roles in the data science industry, check out our guide to data science career options.
Salaries vary, but in the US, data analysts make an average of over $65,000 a year (according to Indeed circa May 2019). Data scientists make, on average, more than $120,000 a year. Even more advanced roles like Senior Data Scientist or Machine Learning Engineer can make upwards of $140,000.
And while it might seem like a salary that high would require a huge up-front investment in education, it actually doesn’t! You can learn all the skills you need for data science affordably online.
Debunking Myths About Data Science Careers
You don’t need a background in math or programming. While either of those would certainly be helpful, it’s totally possible to learn data science from scratch, with no programming skills and no mathematics beyond the stuff you learned in high school. Check out our featured student stories and you’ll find plenty of examples of happily employed students who first came to Dataquest with little or no programming or statistics training.
You don’t need to spend a fortune. College degrees can cost hundreds of thousands of dollars, and even bootcamps typically cost north of $10,000. But at Dataquest, you can learn interactively online for a far more affordable price. 88% of our students say Dataquest is their primary or only source for learning data science, and 96% of students recommend Dataquest for improving your career opportunities.
It won’t take forever. Most of our students meet their learning goals in less than a year (although many stay subscribed for longer to keep their skills sharp and learn new ones as we add new courses), while studying for less than ten hours a week. Becoming a great data scientist takes time and effort, of course, but you might be surprised by just how quickly you can learn all of the fundamental skills required to do great work.
How to Learn Data Science
Interested in getting into data science? There are some things you can do right now to ensure you get started on the right foot.
First, spend some time thinking about why — what’s motivating you to learn data science? Of course, there’s the attractive salary, but try to go deeper and find something about data that interests you. Find the data-based question you want to answer that’s going to push you to keep learning.
Second, read through our data science career guide. It’s long, but it’s based on dozens of interviews with data scientists and data science hiring managers. It will give you a great idea of where the data science industry is right now and what recruiters are looking for as of 2019. Knowing this up front will help you avoid mistakes and save you some time when it does come time to apply for jobs, since you’ll already have a nice portfolio of projects built and ready to go based on the guide’s recommendations.
Third, wherever you learn, make sure you’re learning by doing. At Dataquest, we make sure all of our students apply everything we teach through our interactive, in-browser coding environment, so you’re constantly applying as you learn and getting feedback about whether your code is working correctly. But whether you choose to study on our site or somewhere else, make sure you’re regularly applying what you learn, and spending lots of time actually writing code. It’s easy to watch an hour-long video and feel like you’ve learned something, but you haven’t really learned anything useful unless you apply what you’ve learned yourself.
Fourth, plan on connecting with your peers and the data science community. At Dataquest, we have an online learners community that students can join, but you may also want to check out data science communities on Twitter, Reddit, and other social networks. You can make some great connections and learn a ton from other folks in the community, and most people in data science are very open and generous about sharing their time and expertise.
Ready to dive in? A new and exciting career awaits! You can start learning to code right now in our free introductory courses, and we’ll hook you up with some other great data science resources, too.
Charlie is a student of data science, and also a content marketer at Dataquest.