Relational Algebra



index
Disabled back button Next Section
printable version

Section 0: Module Objectives or Competencies
Course Objective or Competency Module Objectives or Competency
The student will be able to explain and apply the relational algebra operators and explain how they provide the theoretical foundation for relational databases and SQL. The student will be able to explain and apply the relational algebra set operators.
The student will be able to explain and apply the specialized relational algebra relational database operators.
Section 1: Justification

Overview

Relational algebra is very important for several reasons:

Justification

The core of the relational data model is relational algebra, a formal algebra for manipulating data in that model.

Its operations include two groups:

Basis of SQL

SQL is essentially built on relational algebra. The SQL parser translates its Abstract Syntax Tree into a tree of Relational Operators that describe relational algebra operations like filtering, cartesian products, joining, set operations, sorting, expressions and projections.

These operators are further transformed according to the laws of relational algebra and optimized into physical operators that actually execute the query, that is, fetch the data from the storage engine and execute the necessary transformations against rows or columns. The declarative nature of SQL comes from its close relationship with relational algebra and the ability to map SQL clauses almost directly to relational algebra operations.

Here is an additional reference on Relational Algebra and SQL.

Section 2: Operators

Relational algebra defines the theoretical way of manipulating table contents using the following relational functions:

The first three operations (UNION, INTERSECT, and DIFFERENCE) require that the relations to which they are applied have the same type of tuples, i.e., each table must have exactly the same columns, and the domains of those columns (the set of values that they can assume) are identical.

Section 3: UNION

Here is a Venn diagram of a Union operation:

Venn diagram of UNION operation.

The result of the UNION operation is a new relation that includes all tuples from two tables.

Example of UNION operation.
Section 4: INTERSECT

Here is a Venn diagram of an Intersect operation:

Venn diagram of INTERSECT operation.The relations must be union compatible. Example of INTERSECT operation.
Section 5: DIFFERENCE

Here is a Venn diagram of an Difference operation:

Venn diagram of DIFFERENCE operation.The relations must be union compatible. Example of DIFFERENCE operation.
Section 6: SELECT

The SELECT operation can be used to list all of the row's values for each attribute found in a table, or it can select a subset of the tuples in a relation that satisfy a selection condition.

Example of SELECT operation.
Section 7: PROJECT

The results of the PROJECT operation is a list of all values for selected attributes.

Example of PROJECT operation.
Section 8: PRODUCT

The result of PRODUCT is a relation containing a combined set of tuples from two relations, i.e., all possible pairs of rows from the two tables.

Example of PRODUCT operation.
Section 9: JOIN

The JOIN (or natural inner JOIN) operation is used to combine related tuples from two relations into single tuples.

Tables used for JOIN operation.

A natural JOIN links tables by selecting only the rows with common values in their common attribute.

A natural JOIN involves three steps: PRODUCT, SELECT, PROJECT).

  1. PRODUCT:
    Results of PRODUCT operation.
  2. SELECT yields only the rows for which the AGENT_CODE values are equal:
    Results of SELECT operation.
  3. PROJECT eliminates duplicate columns:
    Results of PROJECT operation.

Natural JOIN notes:




Another type of inner JOIN, the equiJOIN links tables based on an equality condition that compares specified attributes of each relation.

Results of equiJoin operation.



An outer JOIN retains unmatched pairs and the unmatched values would be left null.

An outer JOIN is basically an inner JOIN, but also returns extra rows padded with nulls when the JOIN condition is not satisfied.

Section 10: DIVIDE

The DIVIDE operator is used to answer questions about one set of data being associated with ALL values of data in another set.

DIVIDE requires the use of one two-column table as the dividend and one single column table as the divisor.

In the example below, a DIVIDE operation is used to determine which of the customers, if any, purchased every product listed in table 2.

Example of DIVIDE operation.

Note the following:




Example 2: Locating all retailers that sell a particular subset of Big Sky Brewing's product line.

Example of DIVIDE operation.