Why do you need to learn SQL?
1. SQL is used everywhere.
2. It’s in high demand because so many companies use it.
3. SQL is still the most popular language for data work in 2021.
Yes! While the COVID-19 pandemic has dramatically increased the number of data professionals working from home, it hasn't caused companies to change the way they store their data — which is mostly using SQL-based database systems.
SQL is old. There, I said it.
I first heard about SQL in 1997. I was in high school, and as part of a computing class we were working with databases in Microsoft Access. The computers we used were outdated, and the class was boring. Even then, it seemed that SQL was ancient.
SQL’s roots date back more than 50 years to 1970, when Edgar Codd, a computer scientist working for IBM, wrote a paper describing a new system for organizing data in databases. By the end of the decade, several prototypes of Codd’s system had been built, and a query language, the Structured Query Language (SQL), was born to interact with these databases.
In the years since, it has been widely adopted. For decades, learning SQL (which can be pronounced either “sequel” or “S.Q.L.”, by the way) has been a rite of passage for programmers who need to work with databases.
But why should someone who wants to get a job in data spend time learning this ‘ancient’ language in 2021?
Why not spend all your time mastering Python/R, or focusing on ‘sexier’ data skills, like Deep Learning, Scala, and Spark?
While knowing the fundamentals of a more general-purpose language like Python or R is critical, ignoring SQL will make it much harder to get a job in data. Here are three key reasons why:
1. SQL is everywhere
Almost all of the biggest names in tech use SQL. Uber, Netflix, Airbnb — the list goes on. Even within companies like Facebook, Google, and Amazon, which have built their own high-performance database systems, data teams use SQL to query data and perform analysis.
And it’s not just tech companies: companies big and small use SQL. A quick job search on LinkedIn, for example, will show you that more companies are looking for SQL skills than are looking for Python or R skills. SQL may be old, but it’s ubiquitous.
Data Scientist and former Dataquest student Vicknesh got his first job as a Data Analyst. He quickly found himself using SQL daily: “SQL is so pervasive, it permeates everything here. It’s like the SQL syntax persists through time and space. Everything uses SQL or a derivative of SQL.”
2. SQL is in demand
If you want to get a job in data, your focus should be the skills that employers want.
To demonstrate the importance of SQL specifically in data-related jobs, in early 2021 I analyzed more than 32,000 data jobs advertised on Indeed, looking at key skills mentioned in job ads with ‘data’ in the title.
As we can see, SQL is the most in-demand skill among all jobs in data, appearing in 42.7% of all job postings.
Interestingly, the proportion of data jobs listing SQL actually seems to be increasing! When I performed this same analysis in 2017, SQL was also the most in-demand skill, but it was listed in 35.7% of ads.
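To make the method concrete, here is a minimal sketch of how a tally like this could be computed. It is not the code behind the analysis above: the postings are made-up toy strings, and the skill patterns are an illustrative assumption.

```python
import re

# Hypothetical toy postings; the analysis described above used 32,000+ real ads.
postings = [
    "Data Analyst: SQL, Excel and Tableau required",
    "Data Scientist with Python, SQL and machine learning experience",
    "Data Engineer: Spark, Scala and SQL",
    "Senior Data Analyst (R or Python; MySQL a plus)",
]

# Illustrative skill patterns (an assumption, not the author's actual list).
# Word boundaries stop "R" matching inside other words and "SQL" matching
# inside "MySQL"; a real analysis would have to decide how to count dialects.
skill_patterns = {
    "SQL": re.compile(r"\bSQL\b", re.IGNORECASE),
    "Python": re.compile(r"\bPython\b", re.IGNORECASE),
    "R": re.compile(r"\bR\b"),
}


def skill_share(ads, patterns):
    """Return the percentage of ads that mention each skill at least once."""
    total = len(ads)
    return {
        skill: 100 * sum(1 for ad in ads if pattern.search(ad)) / total
        for skill, pattern in patterns.items()
    }


shares = skill_share(postings, skill_patterns)
for skill, pct in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{skill}: {pct:.1f}% of postings")
```

Each posting counts once per skill no matter how often the skill is mentioned, which is what "appearing in X% of job postings" implies.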
If you're looking for your first job in data, it turns out knowing SQL is even more critical.
Most entry-level jobs in data are Data Analyst roles, so I took a look at job ads with ‘data analyst’ in the title, and those numbers are even more conclusive:
For data analyst roles, SQL is again the most in-demand skill, listed in 57.4% of all data analyst jobs. SQL appears in 1.5 times as many "data analyst" job postings as Python, and nearly 2.5 times as many job postings as R.
There's no doubt that if you're looking for a role as a data analyst, learning SQL should be at the top of your to-do list.
In fact, even if you're interested in more advanced roles, SQL skills are critical.
I performed the same analysis on "Data Scientist" and "Data Engineer" job postings, and while SQL isn't the top skill for either of those jobs, it's still listed in 58.2% of data scientist job postings, and 56.4% of data engineer job postings.
That means that even if you're a Python master already, you're going to miss out on nearly 3 out of 5 data scientist and data engineer job openings unless you've got SQL skills on your resume, too.
It will not only make you more qualified for these jobs, it will set you apart from other candidates who’ve only focused on the “sexy” stuff like machine learning in Python.
3. SQL is still the top language for data work
SQL is more popular among data scientists and data engineers than even Python or R. In fact, it's one of the most-used languages in the entire tech industry!
In the chart below, which shows the "most used" technologies from Stack Overflow’s 2020 developer survey, we can see that SQL eclipses even Python in terms of popularity. In fact, it's the third-most-popular language among all developers:
But we're concerned specifically with jobs within the field of data science, so let's filter things down a little further. If we dig into the raw data from the 2020 survey, we find that SQL is even more important and widely used in the context of data jobs.
In the complete dataset, which Stack Overflow has released publicly, we can see that among developers who work with data (including data scientists, data analysts, database administrators, data engineers, etc.), more than 70% use SQL, more than any other language.
And if we filter down still further, into just data scientists and analysts, we can see that SQL is still the most popular technology. 65% of data scientists and data analysts said they used SQL, compared to 64% for Python, and 28% for R.
In other words: SQL is the most-used language in data science, according to the 10,000+ data professionals who responded to Stack Overflow's 2020 survey.
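The filtering described above can be sketched roughly as follows. This is a toy illustration, not the actual analysis: the rows are invented, and the column names `DevType` and `LanguageWorkedWith`, the role labels, and the `;` separator are assumptions about how a multi-select survey export is laid out.

```python
# Hypothetical rows shaped like a multi-select survey export. Column names,
# role labels and the ";" separator are assumptions for illustration only.
responses = [
    {"DevType": "Data scientist;Developer, back-end", "LanguageWorkedWith": "Python;SQL"},
    {"DevType": "Data analyst", "LanguageWorkedWith": "SQL;R"},
    {"DevType": "Developer, front-end", "LanguageWorkedWith": "JavaScript;SQL"},
    {"DevType": "Data analyst;Data scientist", "LanguageWorkedWith": "Python;SQL"},
    {"DevType": "Data scientist", "LanguageWorkedWith": "Python"},
]

DATA_ROLES = {"Data scientist", "Data analyst"}


def split_multi(value):
    """Split a ';'-separated multi-select answer into a set of choices."""
    return {item.strip() for item in value.split(";") if item.strip()}


def language_share(rows, roles):
    """Percentage of respondents holding any of `roles` who use each language."""
    selected = [row for row in rows if split_multi(row["DevType"]) & roles]
    if not selected:
        return {}
    counts = {}
    for row in selected:
        for language in split_multi(row["LanguageWorkedWith"]):
            counts[language] = counts.get(language, 0) + 1
    return {language: 100 * n / len(selected) for language, n in counts.items()}


share = language_share(responses, DATA_ROLES)
for language, pct in sorted(share.items(), key=lambda item: (-item[1], item[0])):
    print(f"{language}: {pct:.0f}%")
```

Because respondents can pick several roles and several languages, the percentages are per respondent and do not need to sum to 100.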
Despite lots of hype around NoSQL, Hadoop and other technologies, SQL remains the most popular language for data work, and one of the most popular languages for developers of all stripes.
So, what’s the best way to learn SQL?
Now that we know why we should learn SQL, the obvious question is: how?
There are literally thousands of SQL courses online, but most of them don’t prepare you for using SQL in the real world. The best way to illustrate this is to look at the queries they teach you to write:
The queries above demonstrate the complexity of the SQL taught at the end of SQL courses by three of the more popular online learning sites. The problem is that real-world SQL doesn’t look like that. Real-world SQL looks like this:
When you’re answering business questions with data, you often write SQL queries that need to combine data from lots of tables, wrangling it into its final form.
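As a rough illustration of that kind of multi-table query (not a query from any course), here is a small sketch that answers "how much revenue does each country generate?" The schema, table names, and rows are all hypothetical, and it runs against an in-memory SQLite database through Python's built-in `sqlite3` module so you can try it without installing anything.

```python
import sqlite3

# Hypothetical four-table schema with toy rows, built in memory.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, country TEXT);
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER);
CREATE TABLE order_items (order_id INTEGER, product_id INTEGER, quantity INTEGER);
CREATE TABLE products (product_id INTEGER PRIMARY KEY, unit_price REAL);

INSERT INTO customers VALUES (1, 'USA'), (2, 'Australia'), (3, 'USA');
INSERT INTO orders VALUES (10, 1), (11, 2), (12, 3);
INSERT INTO order_items VALUES (10, 100, 2), (10, 101, 1), (11, 100, 1), (12, 101, 3);
INSERT INTO products VALUES (100, 5.0), (101, 20.0);
""")

# Step 1 (the CTE): price each order by joining orders, line items and products.
# Step 2: join those totals to customers and aggregate by country.
query = """
WITH order_totals AS (
    SELECT o.order_id,
           o.customer_id,
           SUM(oi.quantity * p.unit_price) AS order_total
    FROM orders AS o
    JOIN order_items AS oi ON oi.order_id = o.order_id
    JOIN products AS p ON p.product_id = oi.product_id
    GROUP BY o.order_id, o.customer_id
)
SELECT c.country,
       COUNT(DISTINCT c.customer_id) AS customers,
       SUM(ot.order_total) AS revenue
FROM customers AS c
JOIN order_totals AS ot ON ot.customer_id = c.customer_id
GROUP BY c.country
ORDER BY revenue DESC;
"""

rows = conn.execute(query).fetchall()
for country, customers, revenue in rows:
    print(country, customers, revenue)
```

Even this toy version needs three joins, a common table expression, and two levels of aggregation, which is a long way from a single-table `SELECT`.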
The end result is students finding themselves unprepared to get the jobs they want, just like this recent post from a data science forum:
What we’re doing about it
Here at Dataquest, we believe that SQL competency is one of the key skills for anyone who wants to get a job in data.
We’re not suggesting you learn SQL instead of Python and/or R. Rather, we suggest learning SQL thoroughly as your second language and becoming comfortable writing queries at a high level.
We understand that learning SQL is incredibly important for data science, and that’s why we offer a number of interactive SQL courses. In our Data Analyst and Data Scientist paths:
Our Data Engineering path also includes a couple of unique courses:
We've also put together a downloadable SQL Cheat Sheet as a useful reference for the SQL basics.
Our interactive courses are written with the goal of equipping our students with the skills they need at the level they’ll need. You won’t spend time watching videos. Instead, you’ll be writing your first queries in minutes, and be on your way to mastering the most important data skill.
While we start from zero, our courses go beyond the basics so you can become a SQL master. As an example, the ‘real-life’ SQL image above is taken from our SQL Intermediate course.
You can sign up and complete the first mission in each course for free, and we encourage you to try them out and let us know what you think.
We Love SQL!
I hope I’ve persuaded you that mastering SQL is key to starting your career in data. While it’s easy to be distracted by the latest and greatest new language or framework, learning SQL will pay dividends on your path to break into the data industry.
It might just be the most important language you learn.
Data Scientist at Dataquest.io. Loves Data and Aussie Rules Football. Australian living in Texas.