Want a Job in Data? Learn SQL.
Why do you need to learn SQL?
1. SQL is used everywhere.
2. It’s in high demand because so many companies use it.
3. Although there are alternatives, SQL is not going anywhere.
SQL is old. There, I said it.
I first heard about SQL in 1997. I was in high school, and as part of a computing class we were working with databases in Microsoft Access. The computers we used were outdated, and the class was boring. Even then, it seemed that SQL was ancient.
SQL dates back almost 50 years to 1970 when Edgar Codd, a computer scientist working for IBM, wrote a paper describing a new system for organizing data in databases. By the end of the decade, several prototypes of Codd’s system had been built, and a query language — the Structured Query Language (SQL) — was born to interact with these databases.
In the years since, it has been widely adopted. Learning SQL (pronounced either “sequel” or “S.Q.L.”, by the way) has for decades been a rite of passage for programmers who need to work with databases.
So why should someone who wants to get a job in data spend time learning this ‘ancient’ language? Why not spend all your time mastering Python/R, or focusing on ‘sexier’ data skills, like Deep Learning, Scala, and Spark?
While knowing the fundamentals of a more general-purpose language like Python or R is critical, ignoring SQL will make it much harder to get a job in data. Here are three key reasons why:
1. SQL is everywhere
Almost all of the biggest names in tech use SQL. Uber, Netflix, Airbnb — the list goes on. Even within companies like Facebook, Google, and Amazon, which have built their own high-performance database systems, data teams use SQL to query data and perform analysis.
And it’s not just tech companies: companies big and small use SQL. A quick job search on LinkedIn, for example, will show you that more companies are looking for SQL skills than are looking for Python or R skills. SQL may be old, but it’s ubiquitous.
Data Scientist and former Dataquest student Vicknesh got his first job as a Data Analyst. He quickly found himself using SQL daily: “SQL is so pervasive, it permeates everything here. It’s like the SQL syntax persists through time and space. Everything uses SQL or a derivative of SQL.”
2. SQL is in demand
If you want to get a job in data, your focus should be the skills that employers want. To demonstrate the importance of SQL specifically in data-related jobs, I analyzed 25,000 jobs advertised on Indeed, looking at key skills mentioned in job ads with ‘data’ in the title:
SQL was easily the most mentioned skill, appearing in 35.7% of ads: 1.39 times as many ads as Python, and more than twice as many as R.
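A count like this can itself be expressed in SQL. What follows is only a minimal sketch of the idea, not the analysis behind the figures above. The `ads` and `ad_skills` tables, their columns, and the handful of sample rows are all made up for illustration, and the skills are assumed to have already been extracted from each ad. It runs on Python's built-in `sqlite3` module.

```python
import sqlite3

# Hypothetical sample data: a stand-in for scraped job ads, NOT the Indeed data set.
ADS = [
    (1, "Data Analyst"),
    (2, "Data Scientist"),
    (3, "Data Engineer"),
    (4, "Software Engineer"),  # no 'data' in the title, so it is filtered out
    (5, "Senior Data Analyst"),
]
# One row per (ad, skill) pair; assumes skills were already extracted from the ad text.
AD_SKILLS = [
    (1, "SQL"), (1, "Excel"),
    (2, "SQL"), (2, "Python"), (2, "R"),
    (3, "SQL"), (3, "Python"),
    (4, "Java"),
    (5, "R"),
]

# Share of 'data' ads that mention each skill, most mentioned first.
# SQLite's LIKE is case-insensitive for ASCII, so 'Data Analyst' matches '%data%'.
SKILL_SHARE_QUERY = """
SELECT s.skill,
       COUNT(DISTINCT s.ad_id) AS n_ads,
       ROUND(100.0 * COUNT(DISTINCT s.ad_id)
             / (SELECT COUNT(*) FROM ads WHERE title LIKE '%data%'), 1) AS pct_of_ads
FROM ad_skills AS s
JOIN ads AS a ON a.id = s.ad_id
WHERE a.title LIKE '%data%'
GROUP BY s.skill
ORDER BY n_ads DESC, s.skill;
"""


def skill_shares():
    """Load the sample rows into an in-memory database and run the query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE ads (id INTEGER PRIMARY KEY, title TEXT)")
    conn.execute("CREATE TABLE ad_skills (ad_id INTEGER, skill TEXT)")
    conn.executemany("INSERT INTO ads VALUES (?, ?)", ADS)
    conn.executemany("INSERT INTO ad_skills VALUES (?, ?)", AD_SKILLS)
    rows = conn.execute(SKILL_SHARE_QUERY).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for skill, n_ads, pct in skill_shares():
        print(f"{skill:<8}{n_ads:>3} ads  {pct:>5}%")
```

With these made-up rows, four ads have 'data' in the title and three of them mention SQL, so SQL tops the list at 75.0%. The real analysis would also need a step that reliably detects skill names in free-text ads, which this sketch skips.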
What if you’re looking for your first job in data? Will SQL be required even for entry-level roles? Most entry-level jobs in data are Data Analyst roles, so I took a look at job ads with ‘data analyst’ in the title, and those numbers are even more conclusive:
For data analysts, SQL is mentioned in the majority of ads, over three times as often as Python and R. Long story short: yes, you need to learn SQL. It will not only make you more qualified for these jobs, it will set you apart from other candidates who’ve only focused on the “sexy” stuff.
3. SQL isn’t going anywhere
SQL is more popular among data scientists and data engineers than Python or R. The fact that SQL is a language of choice is incredibly important. In the chart below, from Stack Overflow’s 2017 developer survey, we can see that SQL eclipses both Python and R in popularity.
Image: Stack Overflow Developer Survey 2017
In Stack Overflow’s 2018 survey, the results were the same. I did some brief analysis of the raw survey data, and among devs who work as data analysts or data scientists, SQL was more commonly used than Python, R, or any other language.
Despite lots of hype around NoSQL, Hadoop, and other technologies, SQL remains one of the most popular languages not just among folks in the data field but among devs of all stripes. In Stack Overflow’s 2019 survey, SQL was the third most popular language overall, and although the raw data from that survey hasn’t been released yet, it’s likely SQL has retained its crown among data science languages this year, too.
This gives aspiring data practitioners the confidence that they’re not learning a dying language, but instead are learning the lingua franca of data.
So, what’s the best way to learn SQL?
Now that we understand why we should learn SQL, the obvious question is ‘how?’
There are literally thousands of SQL courses online, but most of them don’t prepare you for using SQL in the real world. The best way to illustrate this is to look at the queries they teach you to write:
The queries above demonstrate the complexity of the SQL taught at the end of SQL courses by three of the more popular online learning sites. The problem is that real-world SQL doesn’t look like that. Real-world SQL looks like this:
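To make that contrast concrete, here is a minimal sketch that runs a course-style query and a multi-table query side by side using Python's built-in `sqlite3` module. It is not one of the queries pictured in this post. The `customers`, `orders`, and `order_items` tables, their columns, and all of the rows are hypothetical and exist only for this example.

```python
import sqlite3

# The kind of single-table query many courses end on.
SIMPLE_QUERY = """
SELECT name
FROM customers
WHERE country = 'USA'
ORDER BY name;
"""

# Closer to day-to-day work: total each order first, then join the totals
# to customers and summarize by country. The LEFT JOIN keeps countries
# whose customers have not ordered anything yet.
REVENUE_BY_COUNTRY_QUERY = """
WITH order_totals AS (
    SELECT o.id AS order_id,
           o.customer_id,
           SUM(i.quantity * i.unit_price) AS total
    FROM orders AS o
    JOIN order_items AS i ON i.order_id = o.id
    GROUP BY o.id, o.customer_id
)
SELECT c.country,
       COUNT(DISTINCT c.id)      AS customers,
       COUNT(t.order_id)         AS orders,
       COALESCE(SUM(t.total), 0) AS revenue
FROM customers AS c
LEFT JOIN order_totals AS t ON t.customer_id = c.id
GROUP BY c.country
ORDER BY revenue DESC, c.country;
"""


def build_demo_db():
    """Create an in-memory database with made-up sample rows (whole-dollar prices)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, country TEXT);
        CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, order_date TEXT);
        CREATE TABLE order_items (order_id INTEGER, product TEXT,
                                  quantity INTEGER, unit_price INTEGER);

        INSERT INTO customers VALUES
            (1, 'Ana', 'Brazil'), (2, 'Ben', 'USA'),
            (3, 'Cho', 'USA'),    (4, 'Dee', 'Canada');
        INSERT INTO orders VALUES
            (10, 1, '2019-01-05'), (11, 2, '2019-01-07'),
            (12, 2, '2019-02-01'), (13, 3, '2019-02-03');
        INSERT INTO order_items VALUES
            (10, 'keyboard', 1, 50), (10, 'mouse', 2, 20),
            (11, 'monitor', 1, 200), (12, 'mouse', 1, 20),
            (13, 'keyboard', 2, 50);
    """)
    return conn


def run(query):
    """Run a query against a fresh copy of the demo database."""
    conn = build_demo_db()
    rows = conn.execute(query).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print(run(SIMPLE_QUERY))
    for row in run(REVENUE_BY_COUNTRY_QUERY):
        print(row)
```

The first query touches one table and one filter. The second has to aggregate at two levels and decide what happens to customers with no orders, which is the kind of decision a business question tends to force.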
When you’re answering business questions with data, you often write SQL queries that need to combine data from lots of tables, wrangling it into its final form. Because most courses stop short of this, students end up unprepared for the jobs they want, as in this recent post from a data science forum:
What we’re doing about it
Here at Dataquest, we believe that SQL competency is one of the key skills for anyone who wants to get a job in data. We’re not suggesting you learn SQL instead of Python and/or R, but that you learn SQL thoroughly as your second language, becoming familiar with writing queries at a high level.
We understand that learning SQL is incredibly important for data science, and that’s why we offer a number of interactive SQL courses. In our Data Analyst and Data Scientist paths:
- SQL Fundamentals for Python
- SQL Fundamentals for R
- SQL Intermediate: Table Relations and Joins
- SQL and Databases: Advanced
and in our Data Engineering path:
Our interactive courses are written with the goal of equipping our students with the skills they need at the level they’ll need. You won’t spend time watching videos. Instead, you’ll be writing your first queries in minutes, and be on your way to mastering the most important data skill.
While we start from zero, our courses go beyond the basics so you can become a SQL master. As an example, the ‘real-life’ SQL image above is taken from our SQL Intermediate course.
You can sign up and complete the first mission in each course for free, and we encourage you to try them out and let us know what you think.
We love SQL
I hope I’ve persuaded you that mastering SQL is key to starting your career in data. While it’s easy to be distracted by the latest and greatest new language or framework, learning SQL will pay dividends as you break into the data industry.
It might just be the most important language you learn.
Editor’s Note: Updated May 20, 2019.
Data Scientist at Dataquest.io. Loves Data and Aussie Rules Football. Australian living in Texas.