First, let’s talk about the keywords "inner" and "outer." They’re optional—INNER JOIN is the same as JOIN, and LEFT OUTER JOIN is the same as LEFT JOIN. These keywords are added for clarity because they make the joins easier to understand conceptually. Some developers leave them out, arguing there's no reason to have extra nonfunctional words in a database query. The most important thing is to be consistent. We’ll use them.
Inner joins give only the rows where all the joined tables have related data. If we inner join our customers and orders, we'll get all the related customers and orders. We won't get any customers without orders or any orders without customers.
SELECT first_name, phone, orders.cake_id, pickup_date
FROM customers INNER JOIN orders
ON customers.customer_id = orders.customer_id;
/*
+------------+------------+---------+-------------+
| first_name | phone | cake_id | pickup_date |
+------------+------------+---------+-------------+
| Linda | 8095550114 | 4 | 2017-10-12 |
| May | 8015550130 | 4 | 2017-02-03 |
| Frances | 8345550120 | 1 | 2017-09-16 |
| Matthew | 8095550122 | 3 | 2017-07-20 |
| Barbara | 8015550157 | 2 | 2017-07-07 |
...
*/
If we wanted the cake flavor, not just the cake ID, we could also join the cake table:
SELECT first_name, phone, cakes.flavor, pickup_date
FROM customers
INNER JOIN orders ON customers.customer_id = orders.customer_id
INNER JOIN cakes ON orders.cake_id = cakes.cake_id;
/*
+------------+------------+-----------+-------------+
| first_name | phone | flavor | pickup_date |
+------------+------------+-----------+-------------+
| Frances | 8345550120 | Chocolate | 2017-09-16 |
| Theodore | 8015550175 | Chocolate | 2017-08-13 |
| James | 8015550165 | Chocolate | 2017-10-12 |
| Kathleen | 8095550157 | Chocolate | 2017-09-24 |
| Jennifer | 8015550153 | Chocolate | 2017-06-22 |
...
*/
Left outer joins give all the rows from the first table, but only related rows in the next table. So if we run a left outer join on customers and orders, we'll get all the customers, and their orders if they have any.
SELECT cake_id, pickup_date, customers.customer_id, first_name
FROM orders LEFT OUTER JOIN customers
ON orders.customer_id = customers.customer_id
ORDER BY pickup_date;
/*
+---------+-------------+-------------+------------+
| cake_id | pickup_date | customer_id | first_name |
+---------+-------------+-------------+------------+
| 2 | 2017-01-01 | NULL | NULL |
| 3 | 2017-01-01 | 108548 | Eve |
| 1 | 2017-01-01 | 857831 | Neil |
| 4 | 2017-01-01 | NULL | NULL |
| 3 | 2017-01-01 | 168516 | Maria |
...
*/
Right outer joins include any related rows in the first table, and all the rows in the next table. Right outer joining our customers and orders would give the customer if there is one, and then every order.
In our schema, customer_id isn’t NOT NULL on orders. This may be seem unintuitive, but maybe we don’t require customers to register with us to place an order, or orders can be associated with other models like restaurant or vendor. In any case, with our schema, we can have orders without customers.
SELECT customers.customer_id, first_name, pickup_date
FROM customers RIGHT OUTER JOIN orders
ON customers.customer_id = orders.customer_id
ORDER BY pickup_date;
/*
+-------------+------------+-------------+
| customer_id | first_name | pickup_date |
+-------------+------------+-------------+
| NULL | NULL | 2017-01-01 |
| 108548 | Eve | 2017-01-01 |
| 857831 | Neil | 2017-01-01 |
| NULL | NULL | 2017-01-01 |
| NULL | NULL | 2017-01-01 |
...
*/
Right outer joins give the same result as left outer joins with the order of the tables switched:
SELECT customers.customer_id, first_name, pickup_date
FROM orders LEFT OUTER JOIN customers
ON customers.customer_id = orders.customer_id
ORDER BY pickup_date;
/*
same results as right outer join we just did!
+-------------+------------+-------------+
| customer_id | first_name | pickup_date |
+-------------+------------+-------------+
| NULL | NULL | 2017-01-01 |
| 108548 | Eve | 2017-01-01 |
| 857831 | Neil | 2017-01-01 |
| NULL | NULL | 2017-01-01 |
| NULL | NULL | 2017-01-01 |
...
*/
Full outer joins take all the records from every table. Related data are combined like the other joins, but no rows from any table are left out. For customers and orders, we'll get all the related customers and orders, and all the customers without orders, and all the orders without customers.
The standard SQL syntax is:
SELECT order_id, pickup_date, customers.customer_id, first_name
FROM orders FULL OUTER JOIN customers
ON orders.customer_id = customers.customer_id
But MySQL doesn't support full outer joins! No problem, we can get the same result with a UNION of left and right outer joins:
SELECT order_id, pickup_date, customers.customer_id, first_name
FROM orders LEFT OUTER JOIN customers
ON orders.customer_id = customers.customer_id
UNION
SELECT order_id, pickup_date, customers.customer_id, first_name
FROM orders RIGHT OUTER JOIN customers
ON orders.customer_id = customers.customer_id;
/*
+----------+-------------+-------------+------------+
| order_id | pickup_date | customer_id | first_name |
+----------+-------------+-------------+------------+
| 900075 | 2017-05-17 | NULL | NULL |
| 900079 | 2017-12-26 | 487996 | Frances |
| 900057 | 2017-10-25 | 498546 | Loretta |
| NULL | NULL | 640804 | Whitney |
| NULL | NULL | 58405 | Zoe |
...
*/
Using UNION or UNION ALL with this strategy generally emulates a full outer join. But things get complicated for some schemas, like if a column in the ON clause isn't NOT NULL.
Cross joins give every row from the first table paired with every row in the next table, ignoring any relationship. With customers and orders, we'd get every customer paired with every order. Cross joins are sometimes called Cartesian joins because they return the cartesian product of data sets—every combination of elements in every set.
This isn't used often because the results aren't usually useful. But sometimes you might actually need every combination of the rows in your tables, or you might need a large table for performance testing. If you cross join 2 tables with 10,000 rows each, you get a table with 100,000,000 rows!
Self joins refer to any join that joins data in the same table. For example, some of our customers were referred to our bakery by other customers. We could do a left outer join to get every customer and their referrer if they have one:
SELECT customer.first_name, referrer.first_name
FROM customers AS customer LEFT OUTER JOIN customers AS referrer
ON customer.referrer_id = referrer.customer_id;
/*
+------------+------------+
| first_name | first_name |
+------------+------------+
| Tim | NULL |
| Mattie | Wendy |
| Kurtis | NULL |
| Jared | NULL |
| Lucille | Tim |
...
*/