February 11, 2020

SQL Fundamentals Tutorial: Start Learning SQL Today!

Whatever you’re hoping to do with data, having SQL skills is likely to be key.

That’s because despite the fact that SQL is quite old, some version of it is used in the databases of just about every company on earth. Whether you’re interested in doing data science or simply accessing and analyzing a little bit of your company’s data, chances are SQL fundamentals are going to be required.

What is SQL?

We’ll start this SQL fundamentals tutorial with a very fundamental question: what is SQL, anyway? SQL (often pronounced “sequel”) is an abbreviation for Structured Query Language. It’s a kind of programming language that’s designed specifically for use querying and updating databases.

This is useful because it’s not always practical or advisable to store data in other formats like spreadsheets or CSVs. Imagine, for example, a company with terabytes of data that’s being updated regularly by people in different departments across the company. Trying to manage this with spreadsheets would be a logistical and security nightmare! Databases allow for the secure storage of huge amounts of data, and SQL is the tool we use to access and make changes to that data without needing to store it all in a file on our local machine.

In SQL, we express our requests to the database as “queries”. For example, we might send a query — an instruction — to a database to return a specific subset of data, like a particular table, or to update a particular value in the database.

Let’s imagine we want to get some data on salaries from our company’s database. We certainly don’t want to download all of the company’s data, so instead we’ll write a SQL query that gives us only the specific data we need. If we assume there’s a table in the company database called, salaries, that query would probably look something like this:

SELECT * FROM salaries

The * character here means “everything” — select every column from the salaries table. With this single short query, we’ll be able to get just the table we need from our company database.

But SQL can do a lot more than query a full table. It can help us efficiently merge, filter, update, and analyze data to answer all sorts of questions. So let’s dive in and start learning how to write SQL queries!

Note: for many students, it may be easier learn SQL interactively and write queries in your browser rather than following along with a tutorial. You can do that for free in our interactive SQL Fundamentals course:

SQL Fundamentals Tutorial: Introduction

In this tutorial, we’ll explore data from the American Community Survey on job outcome statistics based on college majors. While the original CSV version can be found on FiveThirtyEight’s Github, we’ll be using a slightly modified version of the data that’s stored as a database. (Remember, SQL is used for querying data stored in databases).

Specifically, we’ll be working with a of the data that contains the 2010-2012 data for recent college grads only. This data is stored as a SQLite database (SQLite is one of the most popular SQL-based database management systems).

If you’re familiar with how to query SQLite databases, you can download the database file here. If you’re not comfortable with getting a SQLite database set up and querying it on your local machine, it may be easier to work through our free SQL fundamentals course, which is interactive and allows you to write real SQL queries of this database using only your browser.

Previewing A Table Using SELECT

Whenever we’re working with a new data set, it’s helpful to start by previewing what we’re actually working with. For this tutorial, we’ll be working with the database file jobs.db, which contains a single table named recent_grads:

Here’s a visual representation of that structure:

SQL Table

Let’s start by displaying only the first five rows of our table. This will give us a good idea of what we’re working with, and limiting the query to five rows will ensure it’s quick and we don’t get overwhelmed with a huge table.

Like other programming languages, code in SQL has to adhere to a defined structure and vocabulary. To specify that we want to return the first 5 rows from recent_grads, we need to run the following SQL query:

SELECT * FROM recent_grads LIMIT 5

In this query, we specify:

  • the columns we wanted using SELECT *
  • the table we wanted to query using FROM recent_grads
  • the number of rows we wanted using LIMIT 5

Here’s a visual breakdown of the different components of the query:

SQL Select Breakdown 2

Now, let’s see what this query returns! Here’s what we’ll get when this query is run:

index Rank Major_code Major Major_category Total Sample_size Men Women ShareWomen Employed Full_time Part_time Full_time_year_round Unemployed Unemployment_rate Median P25th P75th College_jobs Non_college_jobs Low_wage_jobs
0 1 2419 PETROLEUM ENGINEERING Engineering 2339 36 2057 282 0.120564 1976 1849 270 1207 37 0.018381 110000 95000 125000 1534 364 193
1 2 2416 MINING AND MINERAL ENGINEERING Engineering 756 7 679 77 0.101852 640 556 170 388 85 0.117241 75000 55000 90000 350 257 50
2 3 2415 METALLURGICAL ENGINEERING Engineering 856 3 725 131 0.153037 648 558 133 340 16 0.024096 73000 50000 105000 456 176 0
3 4 2417 NAVAL ARCHITECTURE AND MARINE ENGINEERING Engineering 1258 16 1123 135 0.107313 758 1069 150 692 40 0.050125 70000 43000 80000 529 102 0
4 5 2405 CHEMICAL ENGINEERING Engineering 32260 289 21239 11021 0.341631 25694 23170 5180 16697 1672 0.061098 65000 50000 75000 18314 4440 972

As we can see, our query returned exactly what we wanted: every column of our selected table, but only the first five rows. If we changed the end of the query to read LIMIT 10, it would return the first ten rows instead. If we remove the LIMIT command altogether, it would return every row in the table.

A Quick Note About Capitalization

Before we move on, you may have noticed that we’re typing some things, like SELECT, in all capital letters, while other things like the table name are written in lower case.

Depending on the specific database you’re querying and what sort of database management system it uses, capitalization of table names may or may not matter. However, to ensure that your queries will work and be easy to read with any system, it’s best to stick to these conventions:

  • Statements or commands (like SELECT, LIMIT, etc.) should be written in all-caps.
  • Table and column names should be written however they are stored in the database you’re querying (so if the database stores them in all lower case, they should be all lower case in your queries, too).

Now, on to our next statement: WHERE!

Filtering Rows Using WHERE

If we take a look at the details of this data set, we can get a better idea of what data is here and what types of questions we might be able to answer. Spend some time getting familiar with what each column represents.

Based on an understanding of what each column represents, here are some questions we may have:

  • Which majors had mostly female students? Which ones had mostly male students?
  • Which majors had the largest spread (difference) between the 25th and 75th percentile starting salaries?
  • Which engineering majors had the highest full time employment rates?

Let’s start by focusing on the first question. We need to translate this question into SQL so that we can get the answer we’re looking for from the database.

To determine which majors had mostly female students, we want the following subset of data:

  • only the Major column
  • only the rows where ShareWomen is greater than 0.5 (corresponding to 5

To return only the Major column, we need to add the specific column name to the SELECT statement part of the query (instead of using the * operator to return all columns):

SELECT Major FROM recent_grads

This will return all of the values in the Major column. We can specify multiple columns this way as well and the results table will preserve the order of the columns:

SELECT Major, Major_category FROM recent_grads

To return only the values where ShareWomen is greater than or equal to 0.5, we need to add a WHERE clause:

SELECT Major FROM recent_gradsWHERE ShareWomen >= 0.5

Finally, we can limit the number of rows returned using LIMIT:

SELECT Major FROM recent_gradsWHERE ShareWomen >= 0.5 LIMIT 5

If we run that query, here’s what we’ll see:

Major
ACTUARIAL SCIENCE
COMPUTER SCIENCE
ENVIRONMENTAL ENGINEERING
NURSING
INDUSTRIAL PRODUCTION TECHNOLOGIES

Here’s a breakdown of the different components of our query:

SQL WHERE Breakdown

While in the SELECT part of the query, we express the specific column we want. In the WHERE part we express the specific rows we want. The beauty of SQL is that these can be independent.

Let’s write a SQL query that returns the majors where females were a minority. We’ll only return the Major and ShareWomen columns (in that order) and, we won’t limit the number of rows returned:

SELECT Major, ShareWomen FROM recent_grads WHERE ShareWomen < 0.5
Major ShareWomen
PETROLEUM ENGINEERING 0.120564
MINING AND MINERAL ENGINEERING 0.101852
METALLURGICAL ENGINEERING 0.153037
NAVAL ARCHITECTURE AND MARINE ENGINEERING 0.107313
CHEMICAL ENGINEERING 0.341631
NUCLEAR ENGINEERING 0.144967
ASTRONOMY AND ASTROPHYSICS 0.441356
MECHANICAL ENGINEERING 0.139793
ELECTRICAL ENGINEERING 0.437847
COMPUTER ENGINEERING 0.199413
AEROSPACE ENGINEERING 0.196450
BIOMEDICAL ENGINEERING 0.119559
MATERIALS SCIENCE 0.310820
ENGINEERING MECHANICS PHYSICS AND SCIENCE 0.183985
BIOLOGICAL ENGINEERING 0.320784
INDUSTRIAL AND MANUFACTURING ENGINEERING 0.343473
GENERAL ENGINEERING 0.252960
ARCHITECTURAL ENGINEERING 0.350442
COURT REPORTING 0.236063
FOOD SCIENCE 0.222695
ELECTRICAL ENGINEERING TECHNOLOGY 0.325092
MATERIALS ENGINEERING AND MATERIALS SCIENCE 0.292607
MANAGEMENT INFORMATION SYSTEMS AND STATISTICS 0.278790
CIVIL ENGINEERING 0.227118
CONSTRUCTION SERVICES 0.342229
OPERATIONS LOGISTICS AND E-COMMERCE 0.322222
MISCELLANEOUS ENGINEERING 0.189970
PUBLIC POLICY 0.251389
ENGINEERING TECHNOLOGIES 0.090713
MISCELLANEOUS FINE ARTS 0.410180
GEOLOGICAL AND GEOPHYSICAL ENGINEERING 0.324838
FINANCE 0.355469
ECONOMICS 0.340825
BUSINESS ECONOMICS 0.249190
NUCLEAR, INDUSTRIAL RADIOLOGY, AND BIOLOGICAL … 0.430537
ACCOUNTING 0.253583
MATHEMATICS 0.244103
PHYSICS 0.448099
MEDICAL TECHNOLOGIES TECHNICIANS 0.434298
STATISTICS AND DECISION SCIENCE 0.281936
ENGINEERING AND INDUSTRIAL MANAGEMENT 0.174123
MEDICAL ASSISTING SERVICES 0.178982
COMPUTER PROGRAMMING AND DATA PROCESSING 0.269194
GENERAL BUSINESS 0.417925
ARCHITECTURE 0.321770
INTERNATIONAL BUSINESS 0.282903
PHARMACY PHARMACEUTICAL SCIENCES AND ADMINISTR… 0.451465
MOLECULAR BIOLOGY 0.077453
MISCELLANEOUS BUSINESS & MEDICAL ADMINISTRATION 0.200023
MISCELLANEOUS ENGINEERING TECHNOLOGIES 0.000000
MECHANICAL ENGINEERING RELATED TECHNOLOGIES 0.377437
INDUSTRIAL AND ORGANIZATIONAL PSYCHOLOGY 0.436302
PHYSICAL SCIENCES 0.426924
MILITARY TECHNOLOGIES 0.429685
ELECTRICAL, MECHANICAL, AND PRECISION TECHNOLO… 0.232444
MARKETING AND MARKETING RESEARCH 0.382900
POLITICAL SCIENCE AND GOVERNMENT 0.485930
GEOGRAPHY 0.473190
COMPUTER ADMINISTRATION MANAGEMENT AND SECURITY 0.180883
COMPUTER NETWORKING AND TELECOMMUNICATIONS 0.305005
GEOLOGY AND EARTH SCIENCE 0.470197
PUBLIC ADMINISTRATION 0.476461
COMMUNICATIONS 0.305109
CRIMINAL JUSTICE AND FIRE PROTECTION 0.125035
COMMERCIAL ART AND GRAPHIC DESIGN 0.374356
SPECIAL NEEDS EDUCATION 0.366177
TRANSPORTATION SCIENCES AND TECHNOLOGIES 0.321296
NEUROSCIENCE 0.475010
MULTI/INTERDISCIPLINARY STUDIES 0.495397
ATMOSPHERIC SCIENCES AND METEOROLOGY 0.124950
EDUCATIONAL ADMINISTRATION AND SUPERVISION 0.448732
PHILOSOPHY AND RELIGIOUS STUDIES 0.416810
ENGLISH LANGUAGE AND LITERATURE 0.339671
SCIENCE AND COMPUTER TEACHER EDUCATION 0.423209
MUSIC 0.444582
COSMETOLOGY SERVICES AND CULINARY ARTS 0.383719

Expressing Multiple Filter Criteria Using AND

To filter rows by specific criteria, we need to use the WHERE statement. A simple WHERE statement requires three things:

  • The column we want the database to filter on: ShareWomen
  • A comparison operator that specifies how we want to compare a value in a column: >
  • The value we want the database to compare each value to: 0.5

Here are the comparison operators we can use:

  • Less than: <
  • Less than or equal to: <=
  • Greater than: >
  • Greater than or equal to: >=
  • Equal to: =
  • Not equal to: !=

The comparison value after the operator must be either text or a number, depending on the field. Because ShareWomen is a numeric column, we don’t need to enclose the number 0.5 in quotes.

Finally, most database systems require that the SELECT and FROM statements come first, before WHERE or any other statements. We can use the AND operator to combine multiple filter criteria. For example, to determine which engineering majors had majority female, we’d need to specify 2 filtering criteria.

SELECT Major FROM recent_gradsWHERE Major_category = 'Engineering' AND ShareWomen > 0.5
Major
ENVIRONMENTAL ENGINEERING
INDUSTRIAL PRODUCTION TECHNOLOGIES

It looks like only 2 majors met this criteria. If we wanted to “zoom” back out to look at all of the columns for both of these majors to see if they shared some other common attributes, we can modify the SELECT statement and use the symbol * to represent all columns

SELECT * FROM recent_gradsWHERE Major_category = 'Engineering' AND ShareWomen > 0.5
index Rank Major_code Major Major_category Total Sample_size Men Women ShareWomen Employed Full_time Part_time Full_time_year_round Unemployed Unemployment_rate Median P25th P75th College_jobs Non_college_jobs Low_wage_jobs
30 31 2410 ENVIRONMENTAL ENGINEERING Engineering 4047 26 2639 3339 0.558548 2983 2384 930 1951 308 0.093589 50000 42000 56000 2028 830 260
38 39 2503 INDUSTRIAL PRODUCTION TECHNOLOGIES Engineering 4631 73 528 1588 0.750473 4428 3988 597 3242 129 0.028308 46000 35000 65000 1394 2454 480

The ability to quickly iterate on queries as you think of new questions is one of the most powerful aspects of SQL. The SQL workflow lets data professionals focus on asking and answering questions, instead of worrying about lower-level programming concepts.

Now, let’s write a SQL query that returns all majors with a majority of female students and that had a median salary greater than 50000. Let’s also limit the columns displayed in the results like so:

SELECT Major, Major_category, Median, ShareWomen FROM recent_grads WHERE ShareWomen > 0.5 AND Median > 50000
Major Major_category Median ShareWomen
ACTUARIAL SCIENCE Business 62000 0.535714
COMPUTER SCIENCE Computers & Mathematics 53000 0.578766

It looks like only two majors with mostly female students also had a median salary higher than $50,000.

Returning One of Several Conditions With OR

We used the AND operator to specify that our filter needs to fit two Boolean conditions. Both of the conditions had to evaluate to True for the record to appear in the results.

If we wanted to specify a filter that meets either of the conditions instead, we could use the OR operator.

SELECT [column1, column2,...] FROM [table1]WHERE [condition1] OR [condition2]

Let’s write a SQL query that returns the first 20 majors that either have a Median salary greater than or equal to 10,000, or have less than or equal to 1,000 Unemployed people. Let’s only include the following columns in the results and in this order:

  • Major
  • Median
  • Unemployed
SELECT Major, Median, Unemployed FROM recent_grads WHERE Median >= 10000 OR Unemployed <= 1000 LIMIT 20
Major Median Unemployed
PETROLEUM ENGINEERING 110000 37
MINING AND MINERAL ENGINEERING 75000 85
METALLURGICAL ENGINEERING 73000 16
NAVAL ARCHITECTURE AND MARINE ENGINEERING 70000 40
CHEMICAL ENGINEERING 65000 1672
NUCLEAR ENGINEERING 65000 400
ACTUARIAL SCIENCE 62000 308
ASTRONOMY AND ASTROPHYSICS 62000 33
MECHANICAL ENGINEERING 60000 4650
ELECTRICAL ENGINEERING 60000 3895
COMPUTER ENGINEERING 60000 2275
AEROSPACE ENGINEERING 60000 794
BIOMEDICAL ENGINEERING 60000 1019
MATERIALS SCIENCE 60000 78
ENGINEERING MECHANICS PHYSICS AND SCIENCE 58000 23
BIOLOGICAL ENGINEERING 57100 589
INDUSTRIAL AND MANUFACTURING ENGINEERING 57000 699
GENERAL ENGINEERING 56000 2859
ARCHITECTURAL ENGINEERING 54000 170
COURT REPORTING 54000 11

Now, let’s start getting even more advanced with our SQL queries.

Grouping Operators With Parentheses

There’s a certain class of questions that we can’t answer using only the techniques we’ve learned so far. For example, if we wanted to write a query that returned all Engineering majors that either had mostly female graduates or an unemployment rate below 5.

The three raw conditions we’ll need are:

Major_category = 'Engineering'ShareWomen >= 0.5Unemployment_rate < 0.051

What the SQL query looks like using parantheses:

SELECT Major, Major_category, ShareWomen, Unemployment_rate FROM recent_grads
WHERE (Major_category = 'Engineering') AND (ShareWomen > 0.5 OR Unemployment_rate < 0.051);

Notice that we enclosed the logic we wanted to be evaluated together in parentheses. This is very similar to how we group mathematical calculations together using parentheses to define which are evaluated first. The parentheses makes it explictly clear to the database that we want all of the rows where both of the expressions in the statements evaluate to True:

(Major_category = 'Engineering' AND ShareWomen > 0.5) -> True or False(ShareWomen > 0.5 OR Unemployment_rate < 0.051) -> True or False

If we had written the where statement without any parentheses, the database would guess what our intentions are, and actually execute the following query instead:

WHERE (Major_category = 'Engineering' AND ShareWomen > 0.5) OR (Unemployment_rate < 0.051)

Leaving the parentheses out implies that we want the calculation to happen from left to right, in the order in which the logic is written, so it wouldn’t return us the data we want.

Now let’s run our intended query and see the results! We’ll run the query we explored above, which returns all Engineering majors that either had mostly women graduates or had an unemployment rate below 5.

Let’s only include the following columns in the results and in this order:

  • Major
  • Major_category
  • ShareWomen
  • Unemployment_rate

Here’s our query:

SELECT Major, Major_category, ShareWomen, Unemployment_rateFROM recent_gradsWHERE (Major_category = 'Engineering') AND (ShareWomen > 0.5 OR Unemployment_rate < 0.051)
Major Major_category ShareWomen Unemployment_rate
PETROLEUM ENGINEERING Engineering 0.120564 0.018381
METALLURGICAL ENGINEERING Engineering 0.153037 0.024096
NAVAL ARCHITECTURE AND MARINE ENGINEERING Engineering 0.107313 0.050125
MATERIALS SCIENCE Engineering 0.310820 0.023043
ENGINEERING MECHANICS PHYSICS AND SCIENCE Engineering 0.183985 0.006334
INDUSTRIAL AND MANUFACTURING ENGINEERING Engineering 0.343473 0.042876
MATERIALS ENGINEERING AND MATERIALS SCIENCE Engineering 0.292607 0.027789
ENVIRONMENTAL ENGINEERING Engineering 0.558548 0.093589
INDUSTRIAL PRODUCTION TECHNOLOGIES Engineering 0.750473 0.028308
ENGINEERING AND INDUSTRIAL MANAGEMENT Engineering 0.174123 0.033652

That looks great, but what if we want to sort the results based on the values of a specific column to make them more readable?

Ordering Results Using ORDER BY

The results of every query we’ve written so far have been ordered by the Rank column. Recall the query from earlier in this post that returned all of the columns and you’ll see the Rank column:

SELECT * FROM recent_grads LIMIT 5
index Rank Major_code Major Major_category Total Sample_size Men Women ShareWomen Employed Full_time Part_time Full_time_year_round Unemployed Unemployment_rate Median P25th P75th College_jobs Non_college_jobs Low_wage_jobs
0 1 2419 PETROLEUM ENGINEERING Engineering 2339 36 2057 282 0.120564 1976 1849 270 1207 37 0.018381 110000 95000 125000 1534 364 193
1 2 2416 MINING AND MINERAL ENGINEERING Engineering 756 7 679 77 0.101852 640 556 170 388 85 0.117241 75000 55000 90000 350 257 50
2 3 2415 METALLURGICAL ENGINEERING Engineering 856 3 725 131 0.153037 648 558 133 340 16 0.024096 73000 50000 105000 456 176 0
3 4 2417 NAVAL ARCHITECTURE AND MARINE ENGINEERING Engineering 1258 16 1123 135 0.107313 758 1069 150 692 40 0.050125 70000 43000 80000 529 102 0
4 5 2405 CHEMICAL ENGINEERING Engineering 32260 289 21239 11021 0.341631 25694 23170 5180 16697 1672 0.061098 65000 50000 75000 18314 4440 972

As the questions we want to answer get more complex, we want more control over how the results are ordered. We can specify the order using the ORDER BY clause.

For example, we may want to understand which majors that met the criteria in the WHERE statement had the lowest unemployment rate. The following query will return the results in ascending order by the Unemployment_rate column.

SELECT Rank, Major, Major_category, ShareWomen, Unemployment_rate
FROM recent_grads
WHERE (Major_category = 'Engineering') AND (ShareWomen > 0.5 OR Unemployment_rate < 0.051)
ORDER BY Unemployment_rate
Rank Major Major_category ShareWomen Unemployment_rate
15 ENGINEERING MECHANICS PHYSICS AND SCIENCE Engineering 0.183985 0.006334
1 PETROLEUM ENGINEERING Engineering 0.120564 0.018381
14 MATERIALS SCIENCE Engineering 0.310820 0.023043
3 METALLURGICAL ENGINEERING Engineering 0.153037 0.024096
24 MATERIALS ENGINEERING AND MATERIALS SCIENCE Engineering 0.292607 0.027789
39 INDUSTRIAL PRODUCTION TECHNOLOGIES Engineering 0.750473 0.028308
51 ENGINEERING AND INDUSTRIAL MANAGEMENT Engineering 0.174123 0.033652
17 INDUSTRIAL AND MANUFACTURING ENGINEERING Engineering 0.343473 0.042876
4 NAVAL ARCHITECTURE AND MARINE ENGINEERING Engineering 0.107313 0.050125
31 ENVIRONMENTAL ENGINEERING Engineering 0.558548 0.093589

If we instead want the results ordered by the same column but in descending order, we can add the DESC keyword:

SELECT Rank, Major, Major_category, ShareWomen, Unemployment_rate
FROM recent_gradsWHERE (Major_category = 'Engineering') AND (ShareWomen > 0.5 OR Unemployment_rate < 0.051)
ORDER BY Unemployment_rate DESC
Rank Major Major_category ShareWomen Unemployment_rate
31 ENVIRONMENTAL ENGINEERING Engineering 0.558548 0.093589
4 NAVAL ARCHITECTURE AND MARINE ENGINEERING Engineering 0.107313 0.050125
17 INDUSTRIAL AND MANUFACTURING ENGINEERING Engineering 0.343473 0.042876
51 ENGINEERING AND INDUSTRIAL MANAGEMENT Engineering 0.174123 0.033652
39 INDUSTRIAL PRODUCTION TECHNOLOGIES Engineering 0.750473 0.028308
24 MATERIALS ENGINEERING AND MATERIALS SCIENCE Engineering 0.292607 0.027789
3 METALLURGICAL ENGINEERING Engineering 0.153037 0.024096
14 MATERIALS SCIENCE Engineering 0.310820 0.023043
1 PETROLEUM ENGINEERING Engineering 0.120564 0.018381
15 ENGINEERING MECHANICS PHYSICS AND SCIENCE Engineering 0.183985 0.006334

Let’s write a query that returns all majors where ShareWomen is greater than 0.3 and Unemployment_rate is less than .1. Let’s only include the following columns in the results and in this order:

  • Major,
  • ShareWomen,
  • Unemployment_rate

We’ll order the results in descending order by the ShareWomen column. This will allow us to easily compare the unemployment rates for majors with higher or lower shares of women.

SELECT Major, ShareWomen, Unemployment_rate FROM recent_grads
WHERE ShareWomen > 0.3 AND Unemployment_rate < .1O
RDER BY ShareWomen DESC

And here are our results!

Major ShareWomen Unemployment_rate
EARLY CHILDHOOD EDUCATION 0.967998 0.040105
MATHEMATICS AND COMPUTER SCIENCE 0.927807 0.000000
ELEMENTARY EDUCATION 0.923745 0.046586
ANIMAL SCIENCES 0.910933 0.050862
PHYSIOLOGY 0.906677 0.069163
MISCELLANEOUS PSYCHOLOGY 0.905590 0.051908
HUMAN SERVICES AND COMMUNITY ORGANIZATION 0.904075 0.037819
NURSING 0.896019 0.044863
GEOSCIENCES 0.881294 0.024374
MASS MEDIA 0.877228 0.089837
COGNITIVE SCIENCE AND BIOPSYCHOLOGY 0.854523 0.075236
ART HISTORY AND CRITICISM 0.845934 0.060298
EDUCATIONAL PSYCHOLOGY 0.817099 0.065112
GENERAL EDUCATION 0.812877 0.057360
SOCIAL WORK 0.810704 0.068828
TEACHER EDUCATION: MULTIPLE LEVELS 0.798920 0.036546
COUNSELING PSYCHOLOGY 0.798746 0.053621
MATHEMATICS TEACHER EDUCATION 0.792095 0.016203
PSYCHOLOGY 0.779933 0.083811
GENERAL MEDICAL AND HEALTH SERVICES 0.774577 0.082102
HEALTH AND MEDICAL ADMINISTRATIVE SERVICES 0.770901 0.089626
SOIL SCIENCE 0.764427 0.000000
AREA ETHNIC AND CIVILIZATION STUDIES 0.758060 0.063429
APPLIED MATHEMATICS 0.753927 0.090823
FAMILY AND CONSUMER SCIENCES 0.752144 0.067128
INDUSTRIAL PRODUCTION TECHNOLOGIES 0.750473 0.028308
SOCIAL PSYCHOLOGY 0.747561 0.029650
HUMANITIES 0.745662 0.068584
HOSPITALITY MANAGEMENT 0.733992 0.061169
SOCIAL SCIENCE OR HISTORY TEACHER EDUCATION 0.733968 0.054083
THEOLOGY AND RELIGIOUS VOCATIONS 0.728495 0.062628
FRENCH GERMAN LATIN AND OTHER COMMON FOREIGN L… 0.728033 0.075566
INTERDISCIPLINARY SOCIAL SCIENCES 0.721866 0.092306
MISCELLANEOUS AGRICULTURE 0.719974 0.059767
JOURNALISM 0.719859 0.069176
MISCELLANEOUS EDUCATION 0.718365 0.059212
COMPUTER AND INFORMATION SYSTEMS 0.707719 0.093460
COMMUNICATION DISORDERS SCIENCES AND SERVICES 0.707136 0.047584
MISCELLANEOUS HEALTH MEDICAL PROFESSIONS 0.702020 0.081411
LIBERAL ARTS 0.700898 0.078268
FORESTRY 0.690365 0.096726
OCEANOGRAPHY 0.688999 0.056995
ART AND MUSIC EDUCATION 0.686024 0.038638
PHYSICAL FITNESS PARKS RECREATION AND LEISURE 0.683943 0.051467
ADVERTISING AND PUBLIC RELATIONS 0.673143 0.067961
HUMAN RESOURCES AND PERSONNEL MANAGEMENT 0.672161 0.059570
MULTI-DISCIPLINARY OR GENERAL SCIENCE 0.669999 0.055807
FINE ARTS 0.667034 0.084186
COMPOSITION AND RHETORIC 0.666119 0.081742
HISTORY 0.651741 0.095667
ECOLOGY 0.651660 0.054475
GENETICS 0.643331 0.034118
TREATMENT THERAPY PROFESSIONS 0.640000 0.059821
NUTRITION SCIENCES 0.638147 0.068701
ZOOLOGY 0.637293 0.046320
INTERNATIONAL RELATIONS 0.632987 0.096799
UNITED STATES HISTORY 0.630716 0.047179
DRAMA AND THEATER ARTS 0.629505 0.077541
CRIMINOLOGY 0.618223 0.097244
MICROBIOLOGY 0.615727 0.066776
PLANT SCIENCE AND AGRONOMY 0.606889 0.045455
BIOLOGY 0.601858 0.070725
SECONDARY TEACHER EDUCATION 0.601752 0.052229
AGRICULTURE PRODUCTION AND MANAGEMENT 0.594208 0.050031
PRE-LAW AND LEGAL STUDIES 0.591001 0.071965
AGRICULTURAL ECONOMICS 0.589712 0.077250
STUDIO ARTS 0.584776 0.089552
ENVIRONMENTAL SCIENCE 0.584556 0.078585
BUSINESS MANAGEMENT AND ADMINISTRATION 0.580948 0.072218
COMPUTER SCIENCE 0.578766 0.063173
LANGUAGE AND DRAMA EDUCATION 0.576360 0.050306
MISCELLANEOUS BIOLOGY 0.566641 0.058545
NATURAL RESOURCES MANAGEMENT 0.564639 0.066619
ENVIRONMENTAL ENGINEERING 0.558548 0.093589
HEALTH AND MEDICAL PREPARATORY PROGRAMS 0.556604 0.069780
MISCELLANEOUS SOCIAL SCIENCES 0.543405 0.073080
ACTUARIAL SCIENCE 0.535714 0.095652
SOCIOLOGY 0.532334 0.084951
BOTANY 0.528969 0.000000
INFORMATION SCIENCES 0.526476 0.060741
PHARMACOLOGY 0.524153 0.085532
GENERAL AGRICULTURE 0.515543 0.019642
BIOCHEMICAL SCIENCES 0.515406 0.080531
INTERCULTURAL AND INTERNATIONAL STUDIES 0.507377 0.083634
PHYSICAL AND HEALTH EDUCATION TEACHING 0.506721 0.074667
CHEMISTRY 0.505141 0.053972
MULTI/INTERDISCIPLINARY STUDIES 0.495397 0.070861
NEUROSCIENCE 0.475010 0.048482
GEOLOGY AND EARTH SCIENCE 0.470197 0.075449
PHARMACY PHARMACEUTICAL SCIENCES AND ADMINISTR… 0.451465 0.055521
EDUCATIONAL ADMINISTRATION AND SUPERVISION 0.448732 0.000000
PHYSICS 0.448099 0.048224
MUSIC 0.444582 0.075960
ASTRONOMY AND ASTROPHYSICS 0.441356 0.021167
ELECTRICAL ENGINEERING 0.437847 0.059174
MEDICAL TECHNOLOGIES TECHNICIANS 0.434298 0.036983
NUCLEAR, INDUSTRIAL RADIOLOGY, AND BIOLOGICAL … 0.430537 0.071540
PHYSICAL SCIENCES 0.426924 0.035354
SCIENCE AND COMPUTER TEACHER EDUCATION 0.423209 0.047264
GENERAL BUSINESS 0.417925 0.072861
PHILOSOPHY AND RELIGIOUS STUDIES 0.416810 0.096052
MISCELLANEOUS FINE ARTS 0.410180 0.089375
COSMETOLOGY SERVICES AND CULINARY ARTS 0.383719 0.055677
MARKETING AND MARKETING RESEARCH 0.382900 0.061215
MECHANICAL ENGINEERING RELATED TECHNOLOGIES 0.377437 0.056357
COMMERCIAL ART AND GRAPHIC DESIGN 0.374356 0.096798
SPECIAL NEEDS EDUCATION 0.366177 0.041508
FINANCE 0.355469 0.060686
ARCHITECTURAL ENGINEERING 0.350442 0.061931
INDUSTRIAL AND MANUFACTURING ENGINEERING 0.343473 0.042876
CONSTRUCTION SERVICES 0.342229 0.060023
CHEMICAL ENGINEERING 0.341631 0.061098
ECONOMICS 0.340825 0.099092
ENGLISH LANGUAGE AND LITERATURE 0.339671 0.087724
ELECTRICAL ENGINEERING TECHNOLOGY 0.325092 0.087557
GEOLOGICAL AND GEOPHYSICAL ENGINEERING 0.324838 0.075038
OPERATIONS LOGISTICS AND E-COMMERCE 0.322222 0.047859
TRANSPORTATION SCIENCES AND TECHNOLOGIES 0.321296 0.072725
BIOLOGICAL ENGINEERING 0.320784 0.087143
MATERIALS SCIENCE 0.310820 0.023043
COMMUNICATIONS 0.305109 0.075177

Master the Fundamentals of SQL

As you can see, SQL is a powerful language for accessing data, and this post is really just scratching the surface. If you’d like to learn more, we encourage you to sign up for free and start our interactive SQL Fundamentals course, which this blog post is based on. The course goes even deeper, though, and will teach you how to use SQL to do things like:

  • Calculate summary statistics
  • Further segment data using grouping
  • Write more complex queries using subqueries

In the course, you’ll continue to work with salary data from recent college graduates as well as data from the CIA World Factbook.

Celeste Grupman

About the author

Celeste Grupman

Celeste is the Director of Operations at Dataquest. She is passionate about creating affordable access to high-quality skills training for students across the globe.

Learn data skills for free

Headshot Headshot

Join 1M+ learners

Try free courses