SQL table joins

Photo by Amol Tyagi on Unsplash

Photo by Amol Tyagi on Unsplash

When working on relational databases it's necessary to join related tables to get all the data we need for our applications. Many beginners quickly learn that you use a left join if you want to join two tables. But SQL has a lot more to offer than a simple left join.

In this article, I will try to explain the difference between the various table joins, what makes them unique, and where you would use them.

Please note that these examples are based on MySQL. Other database management systems may not have the same SQL syntax.

The Tables

What the tables look like

To have some tables to work with as examples I created three tables that together work as a simplification of a youtube clone. Where there is a table with users, a table with videos, and a table with comments on the videos.

ERD showing a many-to-one relation between the video and user table, and many-to-one relation between the video and comment table.

ER-diagrom for the example database.

Below you can see details on the tables:

mysql> EXPLAIN user;
+-------+--------------+------+-----+---------+----------------+
| Field | Type         | Null | Key | Default | Extra          |
+-------+--------------+------+-----+---------+----------------+
| id    | int          | NO   | PRI | NULL    | auto_increment |
| name  | varchar(255) | NO   |     | NULL    |                |
+-------+--------------+------+-----+---------+----------------+
2 rows in set (0.00 sec)
mysql> EXPLAIN video;
+------------+--------------+------+-----+---------+----------------+
| Field      | Type         | Null | Key | Default | Extra          |
+------------+--------------+------+-----+---------+----------------+
| id         | int          | NO   | PRI | NULL    | auto_increment |
| user_id    | int          | NO   | MUL | NULL    |                |
| video_name | varchar(255) | NO   |     | NULL    |                |
+------------+--------------+------+-----+---------+----------------+
3 rows in set (0.00 sec)
mysql> EXPLAIN comment;
+----------+--------------+------+-----+---------+----------------+
| Field    | Type         | Null | Key | Default | Extra          |
+----------+--------------+------+-----+---------+----------------+
| id       | int          | NO   | PRI | NULL    | auto_increment |
| user_id  | int          | NO   | MUL | NULL    |                |
| video_id | int          | NO   | MUL | NULL    |                |
| comment  | varchar(255) | NO   |     | NULL    |                |
+----------+--------------+------+-----+---------+----------------+
4 rows in set (0.00 sec)

To create these tables, the following SQL was used:

create_tables.sql
CREATE DATABASE joins_training;
USE joins_training;

CREATE TABLE user (
    id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    name VARCHAR(255) NOT NULL
);

CREATE TABLE IF NOT EXISTS video (
    id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    name VARCHAR(255) NOT NULL,
    user_id INT NOT NULL,
    CONSTRAINT fk_video_user_id
        FOREIGN KEY (user_id)
        REFERENCES user(id)
);

CREATE TABLE comment (
    id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    comment VARCHAR(255) NOT NULL,
    user_id INT NOT NULL,
    video_id INT NOT NULL,
    CONSTRAINT fk_comment_user_id
        FOREIGN KEY (user_id)
        REFERENCES user(id),
    CONSTRAINT fk_comment_video_id
        FOREIGN KEY (video_id)
        REFERENCES video(id)
);

The Data

To have some data to work with I filled in the following data into the tables.

mysql> SELECT * FROM user;
+----+-------+
| id | name  |
+----+-------+
|  1 | Alice |
|  2 | Bob   |
|  3 | Eve   |
|  4 | John  |
|  5 | Mary  |
+----+-------+
5 rows in set (0.00 sec)
mysql> SELECT * FROM video;
+----+------------------+---------+
| id | name             | user_id |
+----+------------------+---------+
|  1 | CTFs are fun!    |       1 |
|  2 | CTFs are hard... |       1 |
|  3 | Cute cat         |       2 |
|  4 | Scary dog        |       3 |
|  5 | Cool PS1 games   |       4 |
+----+------------------+---------+
5 rows in set (0.00 sec)
mysql> SELECT * FROM comment;
+----+----------------------+---------+----------+
| id | comment              | user_id | video_id |
+----+----------------------+---------+----------+
|  1 | Wow!                 |       5 |        1 |
|  2 | Hang in there!       |       5 |        2 |
|  3 | OMG! Sooo cute       |       1 |        3 |
|  4 | So fluffy!           |       2 |        3 |
|  5 | But... dogs er nice! |       1 |        4 |
+----+----------------------+---------+----------+
5 rows in set (0.00 sec)

To fill in this data, the following SQL was run:

data.sql
USE joins_training;

INSERT INTO user (name) 
VALUES 
    ('Alice'), 
    ('Bob'), 
    ('Eve'),
    ('John'), 
    ('Mary'); 

INSERT INTO video (name, user_id)
VALUES
    ('CTFs are fun!', 1),
    ('CTFs are hard...', 1),
    ('Cute cat', 2),
    ('Scary dog', 3),
    ('Cool PS1 games', 4);

INSERT INTO comment (comment, user_id, video_id)
VALUES
    ('Wow!', 5, 1),
    ('Hang in there!', 5, 2),
    ('OMG! Sooo cute', 1, 3),
    ('So fluffy!', 2, 3),
    ('But... dogs er nice!', 1, 4);

With this data entered intro our tables, we can finally start exploring the differences between the various joins.

Left Joins

Adding data from the right table, where needed

Left joins is perhaps the most common join to be used in the real world. With this join you first run a query to get the correct rows, and then you add columns from another table that match a criteria.

A simple left join is to list all videos, and then replace the user_id with the name of the user who created the video.

video_left_join_user.sql
SELECT
    video.id,
    video.name AS video_name,
    user.name AS user_name
FROM video
LEFT JOIN user
    ON video.user_id = user.id;
mysql> source C:/sql/video_left_join_user.sql;
+----+------------------+-----------+
| id | video_name       | user_name |
+----+------------------+-----------+
|  1 | CTFs are fun!    | Alice     |
|  2 | CTFs are hard... | Alice     |
|  3 | Cute cat         | Bob       |
|  4 | Scary dog        | Eve       |
|  5 | Cool PS1 games   | John      |
+----+------------------+-----------+
5 rows in set (0.00 sec)

The output above will show all videos, and what user made them.

Another more visual use of a left join is when joining all the data from the two tables. When using the left outer join, every row from the left table will be included, and where there is no match in the right table the result will show as NULL.

video_left_join_comment.sql
SELECT 
    * 
FROM video 
LEFT JOIN comment 
    ON video.id = comment.user_id;
mysql> source C:/sql/video_left_join_comment.sql;
+----+------------------+---------+------+----------------------+---------+----------+
| id | name             | user_id | id   | comment              | user_id | video_id |
+----+------------------+---------+------+----------------------+---------+----------+
|  1 | CTFs are fun!    |       1 |    3 | OMG! Sooo cute       |       1 |        3 |
|  1 | CTFs are fun!    |       1 |    5 | But... dogs er nice! |       1 |        4 |
|  2 | CTFs are hard... |       1 |    4 | So fluffy!           |       2 |        3 |
|  3 | Cute cat         |       2 | NULL | NULL                 |    NULL |     NULL |
|  4 | Scary dog        |       3 | NULL | NULL                 |    NULL |     NULL |
|  5 | Cool PS1 games   |       4 |    1 | Wow!                 |       5 |        1 |
|  5 | Cool PS1 games   |       4 |    2 | Hang in there!       |       5 |        2 |
+----+------------------+---------+------+----------------------+---------+----------+
7 rows in set (0.00 sec)

Because there are some videos that don't have comments the comment fields will be NULL.

Right joins

Adding data from the left table, where needed

A right join is simply the exact same as a left join, except you do the ordering in reverse. I never use this join, simply because I find the SQL harder to read, compared to a left join.

user_right_join_video.sql
SELECT
    video.id,
    video.name AS video_name,
    user.name AS user_name
FROM user
RIGHT JOIN video
    ON user.id = video.user_id;
mysql> source C:/sql/user_right_join_video.sql
+------+------------------+---------+----+-------+
| id   | name             | user_id | id | name  |
+------+------------------+---------+----+-------+
|    1 | CTFs are fun!    |       1 |  1 | Alice |
|    2 | CTFs are hard... |       1 |  1 | Alice |
|    3 | Cute cat         |       2 |  2 | Bob   |
|    4 | Scary dog        |       3 |  3 | Eve   |
|    5 | Cool PS1 games   |       4 |  4 | John  |
| NULL | NULL             |    NULL |  5 | Mary  |
+------+------------------+---------+----+-------+
6 rows in set (0.00 sec)

The output above lists all users, and what videos they have made. As you can read from the output above, Mary has not made any videos.

Like with the left join, a more visual use of a right join is when joining all the data from the two tables. When using the right outer join, every row from the right table will be included, and where there is no match in the left table the result will show as NULL.

video_right_join_comment.sql
SELECT 
    * 
FROM video 
LEFT JOIN comment 
    ON video.id = comment.user_id;
mysql> source C:/sql/video_right_join_comment.sql;
+------+------------------+---------+----+----------------------+---------+----------+
| id   | name             | user_id | id | comment              | user_id | video_id |
+------+------------------+---------+----+----------------------+---------+----------+
|    5 | Cool PS1 games   |       4 |  1 | Wow!                 |       5 |        1 |
|    5 | Cool PS1 games   |       4 |  2 | Hang in there!       |       5 |        2 |
|    1 | CTFs are fun!    |       1 |  3 | OMG! Sooo cute       |       1 |        3 |
|    2 | CTFs are hard... |       1 |  4 | So fluffy!           |       2 |        3 |
|    1 | CTFs are fun!    |       1 |  5 | But... dogs er nice! |       1 |        4 |
+------+------------------+---------+----+----------------------+---------+----------+
5 rows in set (0.00 sec)

Because all comments are made to videos there will be no NULL values.

Inner join

Combining tables where there is a reference in both tables

Inner joins are probably the join that is the most difficult to understand. An inner join will join the tables and create a row combining the data from both tables only when there is a match in both tables. Eliminating any rows with NULL values. Let's run the join of the comments again.

video_inner_join_comment.sql
SELECT 
    * 
FROM video 
LEFT JOIN comment 
    ON video.id = comment.user_id;
mysql> source C:/sql/video_inner_join_comment.sql;
+----+------------------+---------+----+----------------------+---------+----------+
| id | name             | user_id | id | comment              | user_id | video_id |
+----+------------------+---------+----+----------------------+---------+----------+
|  1 | CTFs are fun!    |       1 |  3 | OMG! Sooo cute       |       1 |        3 |
|  1 | CTFs are fun!    |       1 |  5 | But... dogs er nice! |       1 |        4 |
|  2 | CTFs are hard... |       1 |  4 | So fluffy!           |       2 |        3 |
|  5 | Cool PS1 games   |       4 |  1 | Wow!                 |       5 |        1 |
|  5 | Cool PS1 games   |       4 |  2 | Hang in there!       |       5 |        2 |
+----+------------------+---------+----+----------------------+---------+----------+
5 rows in set (0.00 sec)

The utility of an inner join might not be so obvious when joining two tables having a many-to-one relationship, especially in an example like this when the comment is required to have a video_id assigned to it. It does become a lot more obvious when joining two tables having a many-to-many relationship, and when there is no required relationship between the tables.

I hope this short article made the differences between the various join types a bit less confusing 🥳