Introduction

Writing queries to extract answers to business questions is often not a straightforward task as the answers may be sitting at the intersection of various tables of the given database. This is a common phenomenon since most databases are designed using the relational database model. To paint a clearer picture of where I’m driving, consider a supermarket’s database, with the following tables and the fields in each table.

Customer table - customer_id, customer_name, segment, age, country, city, state, postal_code, region.

Screenshot 2022-01-06 at 14.51.07.png

Sales table - order_line, order_id, order_date, ship_date, ship_mode, customer_id, product_id, sales, quantity, discount, profit.

Screenshot 2022-01-06 at 14.57.04.png

Product table - product_id, category, sub_category, product_name.

Screenshot 2022-01-06 at 14.58.19.png

Simple business questions such as “What is the total profit gained between 2015 and 2016?”, “What is the mean age of customers per city?” can be answered using the following queries;

SELECT SUM(profit) FROM sales WHERE order_date BETWEEN '2015-01-01' and '2016-01-01';
SELECT city, AVG(age) FROM customer GROUP BY city ORDER BY AVG(age);

(It is safe to say that these are simple business questions in this context because their answers are returned from querying just one table in the database)

However, complex questions such as “What is the total profit gotten from each category of product?”, “What is the total revenue earned from each segment of customer?”, etc., would require a different approach since the data needed to answer these questions are not present in a single table. This leads to the focus of this writing, JOINS.

What are Joins?

Joins are used to create a single table out of multiple tables. They do this by combining rows from separate tables based on a similar column present in the separate tables. It is essential to state that the tables created from Joins are logical and not physical tables.

The tables must share a similar column for a join to be created. This column provides a basis for the joining of the tables.

According to ANSI-Standard, there are five types of Joins.

  • Inner Join
  • Left join
  • Right Join
  • Full Outer Join
  • Cross Join

To understand the different join types, consider the three tables below.

Students Table - contains bio-data of students in a DS class. Students.png

Statistics_students Table - contains scores of students offering the Statistics course in the DS class Statistics.png

Python_students Table - contains scores of students offering the Python course in the DS class.

Python.png

The examples would require that you understand aliasing. If you don't you can read up here and come back to continue.

Inner Join

In simple terms, an inner-join is used to create a new table by combining rows from different tables with the same value in a specified field(column). In other words, it joins rows where the specified field matches. It is similar to the intersection operation in set theory.

To retrieve the bio-data of all statistics students, we use Inner Join with this syntax.

/*In this example and the following ones we join the tables on the Student_ID field since it is a common field in each table*/
SELECT a.*,
b.Student_score
FROM Students as a
INNER JOIN Statistics_students as b
ON a.Student_ID=b.Student_ID;

The resulting table is shown below.

Screenshot 2022-01-27 at 15.20.21.png

Right Join

Right Join creates a table containing all records in the right table joined with the left table where the specified field is matched.

To retrieve the scores of students who offer statistics only or both statistics and python using the right join, we run the query below.

SELECT a.*,
b.Python_score
FROM Python_students as b
RIGHT JOIN Statistics_students as a
ON b.Student_ID = b.Student_ID;

The resulting table is shown below.

Screenshot 2022-01-27 at 15.16.01.png

Left Join

Left Join creates a table containing all records in the left table joined with records in the right table where the specified field is matched.

To retrieve the scores of students who offer statistics only or both statistics and python using left join, we run the query below.

SELECT a.*,
b.Python_score
FROM Statistics_students as a
LEFT JOIN Python_students as b
ON b.Student_ID = b.Student_ID;

The resulting table is shown below.

Screenshot 2022-01-27 at 15.14.01.png

Full Outer Join

The Full Outer Join can be viewed as the combination of both the right and left join. It combines all rows from the separate tables, both where the specified column matches and is not matched. It is similar to the Union operation in set theory.

To retrieve a full outer join on the statistics and python table, we run the query below;

SELECT a.*,
b.Python_score
FROM statistics_students as a
FULL OUTER JOIN python_students as b
ON a.student_ID = b.Student_ID;

The resulting table is shown below.

Screenshot 2022-01-27 at 15.12.21.png

Cross Join

The Cross Join is the least popular type of Join since its use cases aren't as prevalent as the other types. The Cross Join between two tables A and B produces a table C where the number of records is equal to the number of records in table A multiplied by the number of records in table B.

Consider the employee and departments table below;

Cross Join1.png

A cross join between these two tables is gotten from running the query below;

SELECT Employee.*, Department.* FROM Employee, Department;

Or

SELECT Employee.*,
Department.*
FROM Employee
CROSS JOIN Department;

The resulting table will have six rows since the employee table has three rows, and the department table has two rows (3x2 = 6)

Cross Join2.png

The Cross Join is also known as the Cartesian product.

Bonus Tips

  • Subqueries can also be used in some cases to achieve the same results as Joins.

  • A clear understanding of what Join type and when to apply it is more important than cramming the syntax (you can always look up the syntax for Joins online)

  • Using select * when using Joins (and generally) makes your query inefficient. Select only the fields required for the task at hand. If you're interested in knowing why you can read this article.

I sincerely hope that you enjoyed reading this writing. Feel free to leave a comment or question. Gracias!