Top 20 SQL JOINs Interview Questions and Answers (2023)
Data is very valuable to organizations. Actionable insights that give an organization competitive advantage and help it run more efficiently can be extracted from the organization's data. Therefore, data must be collected and stored.
Databases are an organized way to store and query data. There are two main types of databases: relational and non-relational.
Relational databases are very popular because they're structured and data is organized in tables. The data in the various tables may have one-to-one, one-to-many, or many-to-many relationships. The way the data is organized in a relational database is defined in its schema.
Non-relational databases store data using simple key-value pairs in non-tabular form. Although relational databases support Binary Large Objects (BLOB) for storing semi-structured and unstructed data, storing and retrieving these types of data is easier with non-relational databases.
The Structured Query Language (SQL) is the language for accessing and interacting with relational databases. If you want to add, delete, edit, or query information on a relational database, the easiest way to do it is through SQL.
Relational databases dominate the database market and they are projected to grow by more than 30 percent between 2021 and 2026. SQL is the most in-demand skill for data related jobs. As more and more organizations embrace the use of relational databases, the demand for data professionals with SQL skills will continue to grow.
More often than not, you may need to combine two or more tables in a relational database to get the data needed for performing your analysis. You can combine tables in SQL with JOIN clauses. There are several SQL JOIN clauses; understanding how each of them works can be challenging.
SQL JOIN questions are, therefore, interviewers' favorites. They tend to show up in most SQL interviews. In this tutorial, we'll take you step-by-step through the answers to the top 20 SQL JOIN interview questions — and equip you with the knowledge to ace your upcoming SQL interviews. We'll be writing a lot of SQL queries. This SQL Cheat Sheet will help you get up to speed if your SQL skills are a little rusty.
SQL JOINs Interview Questions
In a SQL interview, you'll most likely be asked questions that require you to combine tables. SQL JOIN clauses are used to combine data from two or more tables in a relational database.
SQL JOIN clauses are often used when the tables share some sort of relationship. The most common conditional expression to join tables is the equality conditional (equi-join). However, you can join tables that don't share any relationships and use other conditional expressions besides equality.
Here are some commoly asked SQL JOIN interview questions and their answers. We recommend that you draw your own diagrams and repeatedly practice these questions to master how SQL JOINs work.
1. What are SQL JOINs?
Answer
SQL JOIN clauses are used to combine rows from two or more tables. Usually, tables are joined based on a related column between them, but you can also join tables that have no relationships. The term "SQL JOIN" is used to refer to a SQL query with the JOIN
keyword. This is known as an explicit join. However, you can also join tables in SQL without using the JOIN
keyword. This is known as an implicit join.
2. What are the general explicit and implicit join notation syntaxes?
Answer
EXPLICIT JOIN:
SELECT *
FROM [left table]
[JOIN CLAUSE] [right table]
ON conditional expression;
IMPLICIT JOIN:
SELECT *
FROM [left table], [right table]
WHERE conditional expression;
The conditional expression is not required for some types of SQL JOINs.
3. What are the different types of JOINs in SQL?
Answer
The main types of SQL JOINs are:
- CROSS JOIN: matches every row of the left table with every row of the right table and returns their Cartesian product. It's also known as Cartesian JOIN.
- [INNER] JOIN: returns rows that have matching values on both the right and left tables, based on the specified conditional expression.
- NATURAL JOIN: a type of equi-join that combines the left and right tables using common column names.
- LEFT [OUTER] JOIN: returns all the rows from the left table with the matching rows from the right table, based on the specified conditional expression.
- RIGHT [OUTER] JOIN: returns all the rows from the right table with the matching rows from the left table, based on the specified conditional expression.
- FULL [OUTER] JOIN: returns all rows from the left and the right tables, regardless of whether the conditional expression is matched.
- SELF JOIN: allows you to join a table to itself as if the tables were two different tables.
It's worth noting that the SQL syntax placed inside square brackets is optional and can be excluded from the query. For example, the SQL query below executes an INNER JOIN
:
SELECT *
FROM left_table
JOIN right_table
ON conditional expression;
4. What type of JOIN is executed without conditional expression
in the explict and implict join notations shown below?
Explicit join:
SELECT *
FROM [left table]
[JOIN CLAUSE] [right table]
Implict join:
SELECT *
FROM [left table], [right table]
Answer
Let's set up a SQLite database with tables. When you try to connect to a database that doesn't exist, it's created for you. First:
pip install ipython-sql
Then create the database and tables:
* sqlite:///joins.db
Done.
1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.
Done.
1 rows affected.
1 rows affected.
1 rows affected.
Done.
1 rows affected.
1 rows affected.
1 rows affected.
Done.
1 rows affected.
1 rows affected.
1 rows affected.
We'll use the student_contact
as the left table and staff_contact
as the right table to answer this question. Let's see them:
* sqlite:///joins.db
Done.
student_id | email_address |
---|---|
1 | [email protected] |
2 | [email protected] |
3 | [email protected] |
* sqlite:///joins.db
Done.
staff_id | email_address |
---|---|
1 | [email protected] |
2 | [email protected] |
3 | [email protected] |
If we don't specify the join condition, SQL does it for us. It assumes a CROSS JOIN
. If the left table has n
rows and y
columns, and the right table has m
rows and z
columns, a CROSS JOIN
will return y + z
columns and m x n
rows. Our example will return:
student_id | email_address | staff_id | email_address |
---|---|---|---|
1 | [email protected] | 1 | [email protected] |
1 | [email protected] | 2 | [email protected] |
1 | [email protected] | 3 | [email protected] |
2 | [email protected] | 1 | [email protected] |
2 | [email protected] | 2 | [email protected] |
2 | [email protected] | 3 | [email protected] |
3 | [email protected] | 1 | [email protected] |
3 | [email protected] | 2 | [email protected] |
3 | [email protected] | 3 | [email protected] |
This is true for JOIN clauses that don't have conditional expressions built into them. The NATURAL JOIN
has an equality
expression built into it and will not result in a CROSS JOIN
because we don't even need to specify the conditional expression
when using it.
Let's evaluate whether this is true for the the following explict joins: INNER JOIN
, LEFT JOIN
, RIGHT JOIN
, and FULL JOIN
:
* sqlite:///joins.db
Done.
student_id | email_address | staff_id | email_address_1 |
---|---|---|---|
1 | [email protected] | 1 | [email protected] |
1 | [email protected] | 2 | [email protected] |
1 | [email protected] | 3 | [email protected] |
2 | [email protected] | 1 | [email protected] |
2 | [email protected] | 2 | [email protected] |
2 | [email protected] | 3 | [email protected] |
3 | [email protected] | 1 | [email protected] |
3 | [email protected] | 2 | [email protected] |
3 | [email protected] | 3 | [email protected] |
* sqlite:///joins.db
Done.
student_id | email_address | staff_id | email_address_1 |
---|---|---|---|
1 | [email protected] | 1 | [email protected] |
1 | [email protected] | 2 | [email protected] |
1 | [email protected] | 3 | [email protected] |
2 | [email protected] | 1 | [email protected] |
2 | [email protected] | 2 | [email protected] |
2 | [email protected] | 3 | [email protected] |
3 | [email protected] | 1 | [email protected] |
3 | [email protected] | 2 | [email protected] |
3 | [email protected] | 3 | [email protected] |
* sqlite:///joins.db
Done.
student_id | email_address | staff_id | email_address_1 |
---|---|---|---|
1 | [email protected] | 1 | [email protected] |
1 | [email protected] | 2 | [email protected] |
1 | [email protected] | 3 | [email protected] |
2 | [email protected] | 1 | [email protected] |
2 | [email protected] | 2 | [email protected] |
2 | [email protected] | 3 | [email protected] |
3 | [email protected] | 1 | [email protected] |
3 | [email protected] | 2 | [email protected] |
3 | [email protected] | 3 | [email protected] |
* sqlite:///joins.db
Done.
student_id | email_address | staff_id | email_address_1 |
---|---|---|---|
1 | [email protected] | 1 | [email protected] |
1 | [email protected] | 2 | [email protected] |
1 | [email protected] | 3 | [email protected] |
2 | [email protected] | 1 | [email protected] |
2 | [email protected] | 2 | [email protected] |
2 | [email protected] | 3 | [email protected] |
3 | [email protected] | 1 | [email protected] |
3 | [email protected] | 2 | [email protected] |
3 | [email protected] | 3 | [email protected] |
Let's evaluate whether this is also true for the implicit join:
* sqlite:///joins.db
Done.
student_id | email_address | staff_id | email_address_1 |
---|---|---|---|
1 | [email protected] | 1 | [email protected] |
1 | [email protected] | 2 | [email protected] |
1 | [email protected] | 3 | [email protected] |
2 | [email protected] | 1 | [email protected] |
2 | [email protected] | 2 | [email protected] |
2 | [email protected] | 3 | [email protected] |
3 | [email protected] | 1 | [email protected] |
3 | [email protected] | 2 | [email protected] |
3 | [email protected] | 3 | [email protected] |
Lastly, let's evaluate whether this is true for a NATURAL JOIN
:
* sqlite:///joins.db
Done.
student_id | email_address | staff_id |
---|
NATURAL JOIN
does not return a CROSS JOIN
because of its in-built equality conditional. It searches for the same column name(s) in the left and right tables and applies the equality (=
) conditional. For this example, email_address
is the similar column in both tables. However, there are no matching email addresses in the tables, so an empty table is returned.
5. How is the INNER JOIN executed?
Answer
The INNER JOIN
returns the matching values in both the right and left tables. A CROSS JOIN
is first executed, then the conditional expression specified is used to limit the rows returned. Let's illustrate with the students
and student_contact
tables.
The cross join of these tables returns:
* sqlite:///joins.db
Done.
student_id | first_name | last_name | student_id_1 | email_address |
---|---|---|---|---|
1 | Mary | Wilson | 1 | [email protected] |
1 | Mary | Wilson | 2 | [email protected] |
1 | Mary | Wilson | 3 | [email protected] |
2 | Tim | Ben | 1 | [email protected] |
2 | Tim | Ben | 2 | [email protected] |
2 | Tim | Ben | 3 | [email protected] |
3 | Alice | Robinson | 1 | [email protected] |
3 | Alice | Robinson | 2 | [email protected] |
3 | Alice | Robinson | 3 | [email protected] |
4 | Reece | Bells | 1 | [email protected] |
4 | Reece | Bells | 2 | [email protected] |
4 | Reece | Bells | 3 | [email protected] |
Let's assume we're using the equality
conditional. Then rows matching in the student_id
columns from both tables are returned:
student_id | first_name | last_name | student_id | email_address |
---|---|---|---|---|
1 | Mary | Wilson | 1 | [email protected] |
2 | Tim | Ben | 2 | [email protected] |
3 | Alice | Robinson | 3 | [email protected] |
There is no matching value for Reece Bells on both tables. So, values from this row are not returned. This SQL query can be evaluated using an INNER JOIN
as:
* sqlite:///joins.db
Done.
student_id | first_name | last_name | student_id_1 | email_address |
---|---|---|---|---|
1 | Mary | Wilson | 1 | [email protected] |
2 | Tim | Ben | 2 | [email protected] |
3 | Alice | Robinson | 3 | [email protected] |
6. How is the LEFT OUTER JOIN executed?
Answer
First LEFT OUTER JOIN
returns all the rows in the left table:
student_id | first_name | last_name | student_id | email_address |
---|---|---|---|---|
1 | Mary | Wilson | - | - |
2 | Tim | Ben | - | - |
3 | Alice | Robinson | - | - |
4 | Reece | Bells | - | - |
Next, matching values from CROSS JOIN with
the right tables are also returned:
student_id | first_name | last_name | student_id | email_address |
---|---|---|---|---|
1 | Mary | Wilson | 1 | [email protected] |
2 | Tim | Ben | 2 | [email protected] |
3 | Alice | Robinson | 3 | [email protected] |
4 | Reece | Bells | - | - |
Finally, unmatched rows from the left table are added to the result. Their values are padded with either NULL
or None
:
student_id | first_name | last_name | student_id | email_address |
---|---|---|---|---|
1 | Mary | Wilson | 1 | [email protected] |
2 | Tim | Ben | 2 | [email protected] |
3 | Alice | Robinson | 3 | [email protected] |
4 | Reece | Bells | NULL | NULL |
The SQL query for a LEFT OUTER JOIN
with equality
conditional is shown below:
* sqlite:///joins.db
Done.
student_id | first_name | last_name | student_id_1 | email_address |
---|---|---|---|---|
1 | Mary | Wilson | 1 | [email protected] |
2 | Tim | Ben | 2 | [email protected] |
3 | Alice | Robinson | 3 | [email protected] |
4 | Reece | Bells | None | None |
7. How is the FULL OUTER JOIN executed?
Answer
The FULL OUTER JOIN
returns all rows from the left and the right tables, regardless of whether the conditional expression is matched. To show how this works, we'll create two additional tables. The courses
table lists the courses presently being offered in school, and the last_enrolment
table lists the courses students last enrolled in.
* sqlite:///joins.db
Done.
1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.
Done.
1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.
Let's view these tables:
* sqlite:///joins.db
Done.
course_id | course_name |
---|---|
100 | Advanced Physics |
200 | Computer Science |
300 | Economics |
400 | Quantum Computing |
500 | Cryptography |
* sqlite:///joins.db
Done.
student_id | course_id |
---|---|
2 | 500 |
1 | 500 |
3 | 400 |
2 | 400 |
4 | 111 |
Let's get back to how the FULL OUTER JOIN
works. First, the CROSS JOIN
is evaluated:
* sqlite:///joins.db
Done.
student_id | course_id | course_id_1 | course_name |
---|---|---|---|
2 | 500 | 100 | Advanced Physics |
2 | 500 | 200 | Computer Science |
2 | 500 | 300 | Economics |
2 | 500 | 400 | Quantum Computing |
2 | 500 | 500 | Cryptography |
1 | 500 | 100 | Advanced Physics |
1 | 500 | 200 | Computer Science |
1 | 500 | 300 | Economics |
1 | 500 | 400 | Quantum Computing |
1 | 500 | 500 | Cryptography |
3 | 400 | 100 | Advanced Physics |
3 | 400 | 200 | Computer Science |
3 | 400 | 300 | Economics |
3 | 400 | 400 | Quantum Computing |
3 | 400 | 500 | Cryptography |
2 | 400 | 100 | Advanced Physics |
2 | 400 | 200 | Computer Science |
2 | 400 | 300 | Economics |
2 | 400 | 400 | Quantum Computing |
2 | 400 | 500 | Cryptography |
4 | 111 | 100 | Advanced Physics |
4 | 111 | 200 | Computer Science |
4 | 111 | 300 | Economics |
4 | 111 | 400 | Quantum Computing |
4 | 111 | 500 | Cryptography |
Next, the entire row of the left table, last_enrolment
, is returned:
student_id | course_id |
---|---|
2 | 500 |
1 | 500 |
3 | 400 |
2 | 400 |
4 | 111 |
Then the matching values in the right table, courses
, are returned from the CROSS JOIN
:
student_id | course_id | course_id | course_name |
---|---|---|---|
2 | 500 | 500 | Cryptography |
1 | 500 | 500 | Crypography |
3 | 400 | 400 | Quantum Computing |
2 | 400 | 400 | Quantum Computing |
4 | 111 | - | - |
Next, the non-matching rows in the right table are returned:
student_id | course_id | course_id | course_name |
---|---|---|---|
2 | 500 | 500 | Cryptography |
1 | 500 | 500 | Crypography |
3 | 400 | 400 | Quantum Computing |
2 | 400 | 400 | Quantum Computing |
4 | 111 | - | - |
- | - | 100 | Advanced Physics |
- | - | 200 | Computer Science |
- | - | 300 | Economics |
Finally, the empty cells are padded with NULL
or None
:
student_id | course_id | course_id | course_name |
---|---|---|---|
2 | 500 | 500 | Cryptography |
1 | 500 | 500 | Crypography |
3 | 400 | 400 | Quantum Computing |
2 | 400 | 400 | Quantum Computing |
4 | 111 | NULL | NULL |
NULL | NULL | 100 | Advanced Physics |
NULL | NULL | 200 | Computer Science |
NULL | NULL | 300 | Economics |
Let's confirm that our result is correct by running a FULL OUTER JOIN
query:
* sqlite:///joins.db
Done.
student_id | course_id | course_id_1 | course_name |
---|---|---|---|
2 | 500 | 500 | Cryptography |
1 | 500 | 500 | Cryptography |
3 | 400 | 400 | Quantum Computing |
2 | 400 | 400 | Quantum Computing |
4 | 111 | None | None |
None | None | 100 | Advanced Physics |
None | None | 200 | Computer Science |
None | None | 300 | Economics |
8. How do you eliminate redundant columns from SQL JOINs?
Answer
The first way to eliminate redundant columns when running JOIN queries is to use the USING
clause as our conditional expression:
SELECT *
FROM left_table
[JOIN CLAUSE] right_table
USING [col_name, ..];
Let's first run an INNER JOIN
query:
* sqlite:///joins.db
Done.
student_id | first_name | last_name | student_id_1 | email_address |
---|---|---|---|---|
1 | Mary | Wilson | 1 | [email protected] |
2 | Tim | Ben | 2 | [email protected] |
3 | Alice | Robinson | 3 | [email protected] |
There are duplicate student_id
columns. Let's eliminate this redundancy with the USING
clause:
* sqlite:///joins.db
Done.
student_id | first_name | last_name | email_address |
---|---|---|---|
1 | Mary | Wilson | [email protected] |
2 | Tim | Ben | [email protected] |
3 | Alice | Robinson | [email protected] |
You can see that the duplicate column has been removed.
The second way to eliminate the redundant column is to use the NATURAL JOIN
. The NATURAL JOIN
clause is semantically similar to the INNER JOIN .. USING
clause, in which the conditional expression is equality.
Let try NATURAL JOIN
with the previous example:
* sqlite:///joins.db
Done.
student_id | first_name | last_name | email_address |
---|---|---|---|
1 | Mary | Wilson | [email protected] |
2 | Tim | Ben | [email protected] |
3 | Alice | Robinson | [email protected] |
A third way to eliminate duplicate columns is to list the column names you want displayed in SELECT
:
* sqlite:///joins.db
Done.
student_id | first_name | last_name | email_address |
---|---|---|---|
1 | Mary | Wilson | [email protected] |
2 | Tim | Ben | [email protected] |
3 | Alice | Robinson | [email protected] |
So far, we've seen how the SQL JOIN clauses are used to combine two tables. Some of the SQL JOIN interview questions that we'll see next will require combining and filtering data from more than two tables. Writing SQL JOIN queries for these types of questions may be challenging. Take the Combining Tables in SQL Course to reinforce your knowledge of how SQL JOINS work and prepare yourself for the next questions.
9. How is a UNION clause different from a JOIN clause?
Answer
The JOIN and UNION clauses are used to combine data from two or more tables. With the JOIN clause, the columns and matching rows from both tables are returned. The number of columns in the combined table is the sum of the number of columns on both tables.
UNION combines data from tables by stacking them vertically. The number of columns in the SELECT
statement and the order of their data types must be the same. The staff
and students
tables can be combined with UNION
as follows:
* sqlite:///joins.db
Done.
staff_id | first_name | last_name |
---|---|---|
1 | Ada | Lovelace |
1 | Mary | Wilson |
2 | Adam | Smith |
2 | Tim | Ben |
3 | Alice | Robinson |
3 | Nikolo | Tesla |
4 | Reece | Bells |
* sqlite:///joins.db
Done.
first_name | last_name |
---|---|
Ada | Lovelace |
Adam | Smith |
Alice | Robinson |
Mary | Wilson |
Nikolo | Tesla |
Reece | Bells |
Tim | Ben |
10. What do you understand by Non-EQUI JOIN?
Answer
Non-EQUI JOINs combine tables using other conditional operators besides equality. The other operators may include: less than (<
), less than or equals (<=
), greater than (>
), greater than or equals (>=
), not equals (<>
), and BETWEEN
. For example:
* sqlite:///joins.db
Done.
student_id | course_id | course_id_1 | course_name |
---|---|---|---|
2 | 500 | 100 | Advanced Physics |
2 | 500 | 200 | Computer Science |
1 | 500 | 100 | Advanced Physics |
1 | 500 | 200 | Computer Science |
3 | 400 | 100 | Advanced Physics |
3 | 400 | 200 | Computer Science |
2 | 400 | 100 | Advanced Physics |
2 | 400 | 200 | Computer Science |
11. How will you join a table to itself?
Answer
You can join a table to itself if it contains related columns. Let's assume that Ada Lovelace is the supervisor to the other staff. We'll add a supervisor column with Ada Lovelace's staff_id
. The supervisor
and staff_id
columns are related.
* sqlite:///joins.db
Done.
staff_id | first_name | last_name |
---|---|---|
1 | Ada | Lovelace |
2 | Adam | Smith |
3 | Nikolo | Tesla |
* sqlite:///joins.db
Done.
Done.
staff_id | first_name | last_name | supervisor |
---|---|---|---|
1 | Ada | Lovelace | None |
2 | Adam | Smith | None |
3 | Nikolo | Tesla | None |
* sqlite:///joins.db
Done.
Done.
staff_id | first_name | last_name | supervisor |
---|---|---|---|
1 | Ada | Lovelace | None |
2 | Adam | Smith | 1 |
3 | Nikolo | Tesla | 1 |
Let's perform self join on this table to list the staff and their supervisor:
* sqlite:///joins.db
Done.
staff_id | first_name | last_name | supervisor | staff_id_1 | first_name_1 | last_name_1 | supervisor_1 |
---|---|---|---|---|---|---|---|
1 | Ada | Lovelace | None | None | None | None | None |
2 | Adam | Smith | 1 | 1 | Ada | Lovelace | None |
3 | Nikolo | Tesla | 1 | 1 | Ada | Lovelace | None |
You can observe that the left table, s1
, contains the information for the staff and the right table; s2
contains the information about their supervisors. Let's make things clearer by specifying column names in SELECT
:
* sqlite:///joins.db
Done.
staff_id | first_name | last_name | supervisor_first_name | supervisor_last_name |
---|---|---|---|---|
1 | Ada | Lovelace | None | None |
2 | Adam | Smith | Ada | Lovelace |
3 | Nikolo | Tesla | Ada | Lovelace |
Let's drop the supervisor column we added to the staff
table:
* sqlite:///joins.db
Done.
Done.
staff_id | first_name | last_name |
---|---|---|
1 | Ada | Lovelace |
2 | Adam | Smith |
3 | Nikolo | Tesla |
12. What do you understand about Nested JOINs?
Answer
Nested JOIN simply means combining more than two tables. Let's use nested join to create a table containing students' full information: id, names, contact, and courses last enrolled.
* sqlite:///joins.db
Done.
student_id | first_name | last_name | email_address | course_id | course_name |
---|---|---|---|---|---|
1 | Mary | Wilson | [email protected] | 500 | Cryptography |
2 | Tim | Ben | [email protected] | 500 | Cryptography |
2 | Tim | Ben | [email protected] | 400 | Quantum Computing |
3 | Alice | Robinson | [email protected] | 400 | Quantum Computing |
4 | Reece | Bells | None | 111 | None |
At the start, the students
and student_contact
tables are the left and right tables respectively. When these two tables are joined, the resulting table is the left table for the next join operation with the last_enrolment
table, which is the right table. The resulting table from this join operation is the left table when joining with the courses
table.
13. How will you use a subquery to execute INNER JOINs for more than two tables?
Answer
Both subquery and join can be used to combine data from two or more tables. Subqueries can be used to perform inner join operations if the equality conditional operator is used with the WHERE
clause. This is an example of implict join:
* sqlite:///joins.db
Done.
student_id | first_name | last_name | student_id:1 | email_address | student_id:2 | course_id | course_id:1 | course_name |
---|---|---|---|---|---|---|---|---|
1 | Mary | Wilson | 1 | [email protected] | 1 | 500 | 500 | Cryptography |
2 | Tim | Ben | 2 | [email protected] | 2 | 500 | 500 | Cryptography |
2 | Tim | Ben | 2 | [email protected] | 2 | 400 | 400 | Quantum Computing |
3 | Alice | Robinson | 3 | [email protected] | 3 | 400 | 400 | Quantum Computing |
There are duplicate columns in the result. These can be removed by listing the names of the columns in the outermost SELECT
statement.
If you're having a hard time understanding how subqueries work, our SQL Subqueries course will help you get started. If you're new to SQL and you're having difficulties understanding the SQL queries, we recommend that you learn the basics from our SQL Fundamentals Skill Path.
14. Write the SQL queries to execute the operations shown in the diagrams
Answer
Diagram 1 set notation is $A - B$, read as A difference B. We want all the elements that belong to A but not B.
Let's assume A is students
table and B is student_contact
. We want to return only records that are in students
but not in student_contact
.
Let's view the tables:
* sqlite:///joins.db
Done.
student_id | first_name | last_name |
---|---|---|
1 | Mary | Wilson |
2 | Tim | Ben |
3 | Alice | Robinson |
4 | Reece | Bells |
* sqlite:///joins.db
Done.
student_id | email_address |
---|---|
1 | [email protected] |
2 | [email protected] |
3 | [email protected] |
Only Reece Bells
satisfies the condition in diagram 1. The query is given below:
* sqlite:///joins.db
Done.
student_id | first_name | last_name | email_address |
---|---|---|---|
4 | Reece | Bells | None |
LEFT JOIN
returns all the rows in students
table, and student_contact.student_id IS NULL
sets the intersecting rows to NULL.
Diagram 2 set notation is $(A \cap B)^c$. This is the elements of both A and B that are not common to them. We'll use the last_enrolment
and courses
tables.
Let's view the tables:
* sqlite:///joins.db
Done.
student_id | course_id |
---|---|
2 | 500 |
1 | 500 |
3 | 400 |
2 | 400 |
4 | 111 |
* sqlite:///joins.db
Done.
course_id | course_name |
---|---|
100 | Advanced Physics |
200 | Computer Science |
300 | Economics |
400 | Quantum Computing |
500 | Cryptography |
The common keys are 400 and 500. These will be excluded. The returned table contains 111, 100, 200, and 300. The query is shown below:
* sqlite:///joins.db
Done.
student_id | course_id | course_name |
---|---|---|
4 | 111 | None |
None | 100 | Advanced Physics |
None | 200 | Computer Science |
None | 300 | Economics |
15. Discuss a common pitfall with using JOIN
Answer
There are different types of JOINs. We may need to implement a particular type of JOIN to answer our question correctly. As you're already aware, if you don't specify the conditional expression, your SQL query will not fail. It will return a CROSS JOIN
. If you go on to perform your analysis on the CROSS JOIN
, you'll get an incorrect result. You'll also get incorrect results when you don't join your tables with the appropriate conditional expressions.
You'll also get incorrect results if you incorrectly filter your JOIN with the WHERE
clause. Incorrectly filtering can result in an unintended type of JOIN. For example, a LEFT JOIN
can be transformed to an INNER JOIN
with an incorrect WHERE
clause.
Let's see the LEFT JOIN
result:
* sqlite:///joins.db
Done.
student_id | first_name | last_name | student_id_1 | email_address |
---|---|---|---|---|
1 | Mary | Wilson | 1 | [email protected] |
2 | Tim | Ben | 2 | [email protected] |
3 | Alice | Robinson | 3 | [email protected] |
4 | Reece | Bells | None | None |
Now let's incorrectly filter the LEFT JOIN
to transform it into an INNER JOIN
:
* sqlite:///joins.db
Done.
student_id | first_name | last_name | student_id_1 | email_address |
---|---|---|---|---|
1 | Mary | Wilson | 1 | [email protected] |
2 | Tim | Ben | 2 | [email protected] |
3 | Alice | Robinson | 3 | [email protected] |
Logical errors are introduced into our program this way. Logical errors do not return error messages, making them difficult to detect — especially when we're working with large tables.
Close the connection to the joins.db
database:
16. How will you structure data for facilitating JOIN in a one-to-many relationship condition?
Answer
In a one-to-many relationship, one record in table A can be associated to more than one record in table B. So the primary key of A is a foreign key in B. The students
and last_enrolment
tables share a one-to-many relationship, as a student can be enrolled in more than one course.
-- create a students table:
CREATE TABLE students (
student_id INTEGER PRIMARY KEY,
first_name TEXT NOT NULL,
last_name TEXT NOT NULL
)
--create last_enrolment table
CREATE TABLE last_enrolment (
student_id,
course_id,
FOREIGN KEY (student_id) REFERENCES students(student_id),
FOREIGN KEY (course_id) REFERENCES courses(course_id)
);
In the above code snippet, the student_id
PRIMARY KEY in the students
table is set as FOREIGN KEY in the last_enrolment
table.
17. How will you structure data for facilitating JOIN in a many-to-many relationship condition?
Answer
In a many-to-many relationship, multiple records in table A are associated with multiple records in table B. There is a many-to-many relationship between the students
and courses
tables. A student can take more than one course, and there can be more than one student in a course.
The many-to-many relationship in our example is broken into two one-to-many relationships. The first one-to-many relationship is the relationship between students
and last_enrolment
, and the second one-to-many relationship is the relationship between courses
and last_enrolment
. The last_enrolment
table connects the students
and courses
tables and is referred to as a join table.
-- create a students table:
CREATE TABLE students (
student_id INTEGER PRIMARY KEY,
first_name TEXT NOT NULL,
last_name TEXT NOT NULL
)
-- CREATE the courses table
CREATE TABLE courses (
course_id INTEGER PRIMARY KEY,
course_name TEXT NOT NULL
)
--create last_enrolment table
CREATE TABLE last_enrolment (
student_id,
course_id,
FOREIGN KEY (student_id) REFERENCES students(student_id),
FOREIGN KEY (course_id) REFERENCES courses(course_id)
);
18. Write a query to return information on people that love Jazz music
Write a query to return the email
and full_names
of people that have bought Jazz music. You'll be working with the chinook.db
. You can find the schema and download the file from Answer
First, we connect to the database and list the tables in it:
* sqlite:///chinook.db
Done.
name
album
artist
customer
employee
genre
invoice
invoice_line
media_type
playlist
playlist_track
track
We'll return the email
, first_name
, and last_name
columns on the customer
table for the subset where the genre
is Jazz. We want to connect the customer
table to the genre
table. The schema will be helpful here. In the schema, we'll need to join the customer
, invoice
, invoice_line
, track
, and genre
tables to get the information we want.
* sqlite:///chinook.db
Done.
It's possible for a customer to buy several Jazz music items. So the customer details will be returned as many times as they bought Jazz music. We've used the DISTINCT
keyword to return only one detail for this type of customer.
19. Write a query to return the top 5 spenders on Jazz and Rock
Answer
We'll continue to work with the chinook.db
. We need to join the customer
and genre
tables and filter by the type of genre. We'll be returning the customer full_name
, and the amount_spent
. To get the amount spent, we'll look at the invoice_line
table for the unit_price
and quantity
purchased.
A single customer can buy music from a genre more than once. Their information will apply multiple times. You need to GROUP BY
the customer full_name
and SUM
the values from the product of unit_price
and quantity
to get the total amount spent on that genre:
* sqlite:///chinook.db
Done.
full_name
amount_spent
Fernanda Ramos
15.84
Enrique Muñoz
15.84
Hannah Schneider
14.85
Astrid Gruber
14.85
Kara Nielsen
3.96
* sqlite:///chinook.db
Done.
full_name
amount_spent
Helena Holý
75.24
François Tremblay
74.25
Luís Gonçalves
71.28
Hugh O'Reilly
71.28
João Fernandes
67.32
You can easily get the amount_spent
values wrong if you use the total
column in the invoice
table. This is because a customer can buy songs from more than one genre on the same invoice, and the total
column will contain the full amount spent, not the amount spent on a particular genre.
This is demonstrated in the query below. Ellie Sullivan bought 10 songs from different genres on 2017-04-16. The total amount spent on all the songs is 9.9, which is different from the amount spent on a particular genre.
* sqlite:///chinook.db
Done.
full_name
name
unit_price
quantity
total
Ellie Sullivan
Metal
0.99
1
9.9
Ellie Sullivan
Rock
0.99
1
9.9
Ellie Sullivan
Rock
0.99
1
9.9
Ellie Sullivan
Rock
0.99
1
9.9
Ellie Sullivan
Rock
0.99
1
9.9
Ellie Sullivan
Rock
0.99
1
9.9
Ellie Sullivan
Rock
0.99
1
9.9
Ellie Sullivan
Rock
0.99
1
9.9
Ellie Sullivan
Metal
0.99
1
9.9
Ellie Sullivan
Alternative & Punk
0.99
1
9.9
20. Simplify the query to return the top 5 spenders on Jazz with CTE
CTE stands for Common Table Expression. It enables us to simplify complex queries by deconstructing them into simple blocks. We initiate a CTE with the WITH
clause. The SQL query that answers the previous question is a bit complex. We can make the query more readable by performing most of the join operations with CTE.
* sqlite:///chinook.db
Done.
full_name
amount_spent
Fernanda Ramos
15.84
Enrique Muñoz
15.84
Hannah Schneider
14.85
Astrid Gruber
14.85
Kara Nielsen
3.96
Close connection to chinook.db
:
Takeaway
In this tutorial, we discussed the top 20 SQL JOINs interview questions and provided a step-by-step guide to their answers. If you aspire to become a business analyst, data analyst, or data scientist, knowledge of SQL is one of the leading requirements for your role. No two ways about it: you must learn SQL programming.
As you work with relational databases, one of the most common SQL operations you'll perform is joining tables. This is why SQL JOIN questions are so popular with interviewers. At your SQL interviews, you'll get several questions on SQL JOINs to test your knowledge of how they work. We've provided answers to some of the most common questions you may be asked.
If you haven't learned the basics of SQL yet, this tutorial may not be helpful. Learn how to explore, query, join, and filter data from relational databases in our SQL Fundamentals Skill Path. You'll master the SQL syntaxes and basic commands that you'll need to start working with relational databases. On our interactive learning platform, you get to practice writing your own SQL queries.
We understand that porfolio projects are very important, and they help showcase your SQL programming and data analysis skills to potential recruiters. You get to complete real-life SQL projects on the SQL Fundamentals Skill Path. We've also curated 10 exiciting SQL project ideas to help you get more practice and projects in your SQL portfolio.