JOIN Explanation



index
Disabled back button Next Section
printable version

Section 0: Module Objectives or Competencies
Course Objective or Competency Module Objectives or Competency
The student will be able to apply various types of SQL JOIN operations to combine related rows from two tables into single virtual tables. The student will also be able to explain why the JOIN is the power behind a relational database. The student will be able to apply an INNER JOIN using the WHERE clause.
The student will be able to apply an INNER JOIN using the INNER JOIN ON keywords.
Section 1: How to JOIN

An earlier set of course notes stated that "JOIN is the power behind a relational database."

If you are trying to extract data from multiple tables, then you must specify, for SQL, how those tables are joined.

This explanation will use the Northwinds database seen in the SQL Exercise.

Here is the ERD for Northwinds:

ER Diagram for Northwinds.

Here is the associated list of entities and attributes:




The tables are linked as follows:




A JOIN must specify the linkages between all the tables that provide data for your query. If two tables are not connected directly, you have to include all tables between the two, and specify their linkages.

For example, to list what products were purchased by each customer, you need CompanyName from Customers and ProductName from Products.

SELECT Customers.CompanyName, Products.ProductName

But look at the relevant portion of the ERD below – Customers and Products are not directly connected.

Excerpt from Northwinds ER Diagram.

To get from Customers to Products you must go from Customers to Orders to Order Details to Product. Hence, the FROM portion of your query must list Customers, Orders, Order Details, and Products:

FROM Customers, Orders, `Order Details`, Products

Next, you must specify how those tables are linked.

Using WHERE clauses, that could be written as

WHERE Customers.CustomerID = Orders.CustomerID
AND Orders.OrderID = `Order Details`.OrderID
AND `Order Details`.ProductID = Products.ProductID

Put all the pieces together and you end up with the following query:

SELECT Customers.CompanyName, Products.ProductName
FROM Customers, Orders, `Order Details`, Products
WHERE Customers.CustomerID = Orders.CustomerID
AND Orders.OrderID = `Order Details`.OrderID
AND `Order Details`.ProductID = Products.ProductID;

Here is the INNER JOIN version with the ON clause:

SELECT Customers.CompanyName, Products.ProductName
FROM Customers
INNER JOIN Orders ON (Customers.CustomerID = Orders.CustomerID)
INNER JOIN `Order Details` ON (Orders.OrderID = `Order Details`.OrderID))
INNER JOIN Products ON (`Order Details`.ProductID = Products.ProductID);

Here is the INNER JOIN version with the USING clause:

SELECT Customers.CompanyName, Products.ProductName
FROM Customers
INNER JOIN Orders USING(CustomerID)
INNER JOIN `Order Details` USING(OrderID)
INNER JOIN Products USING(ProductID);

Remember, you can only use the USING clause when you are joining table1 and table2 on a column with the same name in both tables

Section 2: JOIN Example 2

This example comes from a version of the project case, Wind Riders of the Lost River Range.

Here is a pretty common query that could form the basis for more complex queries, but let's start with something simple.

Query: Generate a list of all customer returns. including where each defective item was ultimately shipped. Include customer number, customer name, model number, model description, model manufacturer, and discount retailer name.




Where do we get the data required to solve this query?

While it might appear at first glance that the query involves only three tables – customer, productCategory, and discountRetailer – an examination of the entity-relationship diagram shows that more tables are involved. The portion of the entity-relationship diagram that we need to focus on appears below:

Excerpt from ER Diagram for WRLRR returns.


Let's focus only on the entities involved in the query:

More specific excerpt from ER Diagram for WRLRR returns.

As you can see, to get from customer to productCategory (or to discountRetailer) involves going through links in the returns table, the returnItems table, and the itemPurchased table.




Let’s list the tables involved individually:

Let’s look at the attribute list for each entity:

While the colors above indicate how tables are joined, they are also summarized inthe following table:

Table 1 Table 2 Linking Attribute
customer returns customerNum
returns returnItem returnNum
returnItem itemPurchased equipmentSerialNum
itemPurchased productCategory modelNum
itemPurchased discountRetailer retailerNum



To write the query, we begin with SELECT and the attributes that we want included in the result. Remember that if any attribute name appears in more than one table, we must specify the table name as well.

If first and last name are separate, we can join them with CONCAT. Here we want customer number, customer name, model number, model description, model manufacturer, and discount retailer.

SELECT customer.customerNum, CONCAT(customerFirstName, ' ', customerLastName) AS `Name`,
productCategory.modelNum, modelDescription, modelManufacturer, retailerName

We next indicate the tables from which the data will come:

FROM customer, returns, returnitem, itempurchased, productcategory, discountretailer

Then we specify how the tables are linked together, as shown in the table above.

WHERE customer.customerNum = returns.customerNum
AND returns.returnNum = returnItem.returnNum
AND returnItem.equipmentSerialNum = itempurchased.equipmentSerialNum
AND itempurchased.modelNum = productCategory.modelNum
AND itempurchased.retailerNum = discountretailer.retailerNum

Note that while I prefer to list the linkages in the order in which they are encountered (table 1 to table 2, table 2 to table 3, table 3 to table 4, etc.), the order is irrelevant. It simply makes it easier for me to be sure I do not omit any links.

The final query looks like this:

SELECT customer.customerNum, CONCAT(customerFirstName, ' ', customerLastName) AS `Name`,
productCategory.modelNum, modelDescription, modelManufacturer, retailerName
FROM customer, returns, returnitem, itempurchased, productcategory, discountretailer
WHERE customer.customerNum = returns.customerNum
AND returns.returnNum = returnItem.returnNum
AND returnItem.equipmentSerialNum = itempurchased.equipmentSerialNum
AND itempurchased.modelNum = productCategory.modelNum
AND itempurchased.retailerNum = discountretailer.retailerNum;

The query could also be written using an INNER JOIN with an ON clause.

SELECT customer.customerNum, CONCAT(customerFirstName, ' ', customerLastName) AS `Name`,
productCategory.modelNum, modelDescription, modelManufacturer, retailerName
FROM customer
INNER JOIN returns ON customer.customerNum = returns.customerNum
INNER JOIN returnitem ON returns.returnNum = returnItem.returnNum
INNER JOIN itempurchased ON returnItem.equipmentSerialNum = itempurchased.equipmentSerialNum
INNER JOIN productcategory ON itempurchased.modelNum = productCategory.modelNum
INNER JOIN discountretailer ON itempurchased.retailerNum = discountretailer.retailerNum;

Here it is with an INNER JOIN and a USING clause.

SELECT customer.customerNum, CONCAT(customerFirstName, ' ', customerLastName) AS `Name`,
productCategory.modelNum, modelDescription, modelManufacturer, retailerName
FROM customer
INNER JOIN returns USING(customerNum)
INNER JOIN returnitem USING(returnNum)
INNER JOIN itempurchased USING(equipmentSerialNum)
INNER JOIN productcategory USING(modelNum)
INNER JOIN discountretailer USING(retailerNum);

Remember, you can only use the USING clause when you are joining table1 and table2 on a column with the same name in both tables




No matter which JOIN approach you prefer, you have to be sure to include all of the necessary tables, and you have to specify the links between each of the tables. Understanding JOINs is the key to writing queries.