# Relational algebra operations

Operations in the Relational Data Model are defined by Relational Algebra. Relational Algebra is not based on the attributes, but rather on the order of values (components) in the tuple. There are five basic operations [7]:

1. Union,
2. Difference,
3. Cartesian Product,
4. Projection,
5. Selection.

The Union of relations and is defined as:

 (3.2)

It is the set of tuples that are in or . This operation may be applied to the relations of the same arity only, so all the tuples in the result have the same number of components (values).

The Difference of relations and , defined as:

 (3.3)

It is the set of tuples which are in but not in . It is required that and have the same arity.

The Cartesian Product of relations and is defined as:

 (3.4)

Assume that has arity and has arity . is defined as the set of all possible ()-tuples which first components form a tuple in and last form a tuple in . The result has the arity equal to .

The Projection is denoted as . Assuming that there is a relation of arity , then:

 (3.5)

It denotes the projection of onto components , where are distinct integers in range . That is the set of -tuples such that there is some -tuple in for which for .

For example is computed by taking each tuple from , and forming a 2-tuple set from fourth and second component of . Instead of using locations, attribute names can be used as well. Assuming that is the relation schema for the above relation. Then the above projection can also be denoted as . The resulting relation can be described by the following relation schema: .

The Selection is denoted as:

 (3.6)

Where is a relation and is a formula involving:
• Operands that are constants or component numbers; component is represented by ,
• The arithmetic comparison operators , , , , , ,
• The logical operators (and), (or), (not).
Then is the set of tuples in such that, when for all the -th component of is substituted for any occurrences of in formula , the formula becomes true. As for projection if a relation has named columns, then the formula in the selection can refer to columns by name instead of by number. The arity of is the same as the arity of .

There are also some other algebraic operations defined as: Intersection, Quotient, Join, Natural Join or Semijoin. They are all derived from the basic ones.

The Intersection can be defined as:

 (3.7)

Thus its functionality is provided by difference, that is why it is not considered to be recognized as a basic operation.

The Quotient of relations and is denoted as . Assuming that and are relations of arity and respectively, and , and . Then the quotient is a set of all ()-tuples , such that for all -tuples in , the tuple is in :

 (3.8)

The Join (also known as -Join) of and on columns and is defined as:

 (3.9)

Where is a comparison operator which complies with the definition 3.6, and is the arity of . In other words, these are all the tuples in the Cartesian product of and such that the -th component of stands in relation to the th component of . If is '' then the operation is called the .

The Natural Join is a special case of the Join. It is applicable only when both and have columns that are named by attributes. Assuming that are all the attribute names used for both and , then the Natural Join is defined as:

 (3.10)

such that is the list of all components of , in order, except the components .

The Semijoin of relation by relation , written , is the projection onto the attributes of of the natural join of and :

 (3.11)

where stands for the list of attributes of .

Relational Algebra becomes a natural way to formulate queries. Having selection and projection, a question about a single relation can be asked. Adding Cartesian product, the query can concern more than one relation. Applying union or difference, several queries can be combined, to form a more sophisticated one.

Igor Wojnicki 2005-11-07