Relational algebra operations

Operations in the Relational Data Model are defined by Relational Algebra. Relational Algebra refers to values by their position (the order of components) in a tuple rather than by attribute names. There are five basic operations [7]:

Union,
Difference,
Cartesian Product,
Projection,
Selection.

The Union of relations $R$ and $S$ is defined as:

$\begin{displaymath} R \cup S = \{ \langle x_1,\ldots,x_n \rangle : \langle x_1,\ldots,x_n \rangle \in R \vee \langle x_1,\ldots,x_n \rangle \in S \}. \end{displaymath}$

(3.2)

It is the set of tuples that are in $R$ or in $S$ (or in both). This operation may be applied only to relations of the same arity, so all the tuples in the result have the same number of components (values).

The Difference of relations $R$ and $S$ is defined as:

$\begin{displaymath} R - S = \{ \langle x_1,\ldots,x_n \rangle \in R : \langle x_1,\ldots,x_n\rangle \notin S \}. \end{displaymath}$

(3.3)

It is the set of tuples which are in $R$ but not in $S$. It is required that $R$ and $S$ have the same arity.
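
The two set operations above can be sketched in a few lines. This is a minimal illustration, assuming a relation is modelled as a Python set of equal-length tuples; the function names `check_same_arity`, `union`, and `difference` are hypothetical and not part of the source.

```python
def check_same_arity(r, s):
    """Raise ValueError unless every tuple in r and s has the same length."""
    arities = {len(t) for t in r} | {len(t) for t in s}
    if len(arities) > 1:
        raise ValueError("relations must have the same arity")


def union(r, s):
    """R union S: tuples that are in r, in s, or in both."""
    check_same_arity(r, s)
    return set(r) | set(s)


def difference(r, s):
    """R - S: tuples that are in r but not in s."""
    check_same_arity(r, s)
    return set(r) - set(s)


R = {(1, "a"), (2, "b")}
S = {(2, "b"), (3, "c")}

print(sorted(union(R, S)))       # [(1, 'a'), (2, 'b'), (3, 'c')]
print(sorted(difference(R, S)))  # [(1, 'a')]
```

Because relations are sets, the tuple that occurs in both operands appears only once in the union.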

The Cartesian Product of relations $R$ and $S$ is defined as:

$\begin{displaymath} R \times S = \{ \langle x_1,\ldots,x_n,y_1,\ldots,y_m \rangle : \langle x_1,\ldots,x_n \rangle \in R \wedge \langle y_1,\ldots,y_m \rangle \in S \}. \end{displaymath}$

(3.4)

Assume that $R$ has arity $n$ and $S$ has arity $m$. $R \times S$ is defined as the set of all possible ($n+m$)-tuples whose first $n$ components form a tuple in $R$ and whose last $m$ components form a tuple in $S$. The result has arity $n+m$.
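
As a minimal sketch of the Cartesian product, assuming relations are Python sets of tuples (the name `cartesian_product` is illustrative only), each result tuple is the concatenation of one tuple from each operand:

```python
from itertools import product


def cartesian_product(r, s):
    """R x S: every tuple of r concatenated with every tuple of s."""
    return {x + y for x, y in product(r, s)}


R = {(1, 2), (3, 4)}      # arity 2
S = {("a",), ("b",)}      # arity 1

result = cartesian_product(R, S)
print(sorted(result))
# [(1, 2, 'a'), (1, 2, 'b'), (3, 4, 'a'), (3, 4, 'b')]
```

The result has 2 * 2 = 4 tuples, each of arity 2 + 1 = 3.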

The Projection is denoted as $\pi$. Assuming that there is a relation $R$ of arity $k$, then:

$\begin{displaymath} \pi_{i_1,i_2,\ldots,i_m}(R). \end{displaymath}$

(3.5)

It denotes the projection of $R$ onto components $i_1,i_2,\ldots,i_m$, where the $i_j$ are distinct integers in the range $1,\ldots,k$. That is the set of $m$-tuples $\langle a_1,a_2,\ldots,a_m \rangle$ such that there is some $k$-tuple $\langle b_1,b_2,\ldots,b_k \rangle$ in $R$ for which $a_j=b_{i_j}$ for $j=1,2,\ldots,m$.

For example, $\pi_{4,2}(R)$ is computed by taking each tuple $t$ from $R$ and forming a 2-tuple from the fourth and second components of $t$, in that order. Instead of positions, attribute names can be used as well. Assume that $R(A,B,C,D)$ is the relation schema for the above relation. Then the above projection can also be denoted as $\pi_{D,B}(R)$. The resulting relation has a schema consisting of the attributes $D$ and $B$, in that order.
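
The projection example can be sketched as follows. This is an illustration under stated assumptions: relations are Python sets of tuples, component numbers are 1-based as in the text, and the schema is a plain tuple of attribute names; `project` and `project_by_name` are hypothetical helper names.

```python
def project(r, positions):
    """pi_{i_1,...,i_m}(R) with 1-based component numbers."""
    return {tuple(t[i - 1] for i in positions) for t in r}


def project_by_name(schema, r, names):
    """Same projection, with components looked up by attribute name."""
    return project(r, [schema.index(a) + 1 for a in names])


schema = ("A", "B", "C", "D")
R = {(1, 2, 3, 4), (5, 6, 7, 8), (9, 2, 0, 4)}

print(sorted(project(R, [4, 2])))  # [(4, 2), (8, 6)]
print(project_by_name(schema, R, ["D", "B"]) == project(R, [4, 2]))  # True
```

Two of the three input tuples agree on their fourth and second components, so the result, being a set, holds only two tuples.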

The Selection is denoted as:

$\begin{displaymath} \sigma_F(R). \end{displaymath}$

(3.6)

Where $R$ is a relation and $F$ is a formula involving:

Operands that are constants or component numbers; component $i$ is represented by $\$i$,
The arithmetic comparison operators $<$, $=$, $>$, $\leq$, $\neq$, $\geq$,
The logical operators $\wedge$ (and), $\vee$ (or), $\neg$ (not).

Then $\sigma_F(R)$ is the set of tuples $t$ in $R$ such that, when for all $i$ the $i$-th component of $t$ is substituted for every occurrence of $\$i$ in formula $F$, the formula $F$ becomes true. As with projection, if a relation has named columns, then the formula in the selection can refer to columns by name instead of by number. The arity of $\sigma_F(R)$ is the same as the arity of $R$.
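
A minimal sketch of selection, assuming relations are Python sets of tuples and that the formula $F$ is supplied as a Python predicate over a tuple, so that the component written $\$i$ in the text corresponds to `t[i - 1]`. The name `select` and the sample formula are illustrative only.

```python
def select(r, formula):
    """sigma_F(R): the tuples of r for which the formula evaluates to true."""
    return {t for t in r if formula(t)}


R = {(1, 2, 3), (4, 5, 6), (7, 2, 9)}


def F(t):
    # Encodes the formula ($2 = 2) AND NOT ($1 > 5).
    return t[1] == 2 and not (t[0] > 5)


result = select(R, F)
print(result)  # {(1, 2, 3)}
```

The selected tuples are unchanged, so the arity of the result equals the arity of the input relation.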

Other algebraic operations are also defined, such as Intersection, Quotient, Join, Natural Join, and Semijoin. They can all be derived from the basic ones.

The Intersection can be defined as:

$\begin{displaymath} R \cap S = R - (R - S). \end{displaymath}$

(3.7)

Its functionality is thus provided by difference alone, which is why it is not considered a basic operation.
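
The identity (3.7) can be checked on a small example. The sketch below assumes relations are Python sets of tuples and compares the derived form against Python's built-in set intersection; the sample data is made up for illustration.

```python
R = {(1, "a"), (2, "b"), (3, "c")}
S = {(2, "b"), (3, "c"), (4, "d")}

# Intersection derived from difference only: R - (R - S).
derived = R - (R - S)

print(sorted(derived))   # [(2, 'b'), (3, 'c')]
print(derived == R & S)  # True
```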

The Quotient of relations $R$ and $S$ is denoted as $R \div S$. Assume that $R$ and $S$ are relations of arity $r$ and $s$ respectively, with $r > s$ and $S \neq \emptyset$. Then the quotient is the set of all ($r-s$)-tuples $\langle a_1,a_2,\ldots,a_{r-s} \rangle$ such that for all $s$-tuples $\langle a_{r-s+1},\ldots,a_r \rangle$ in $S$, the tuple $\langle a_1,a_2,\ldots,a_r \rangle$ is in $R$. In terms of the basic operations:

$\begin{displaymath} R \div S = \pi_{1,2,\ldots,r-s}(R) - \pi_{1,2,\ldots,r-s}((\pi_{1,2,\ldots,r-s}(R) \times S) - R). \end{displaymath}$

(3.8)
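
Formula (3.8) can be followed step by step in code. This is a sketch under stated assumptions: relations are non-empty Python sets of tuples, arities are read off an arbitrary tuple, and the names `quotient` and `project_prefix` as well as the student/course data are hypothetical.

```python
from itertools import product


def project_prefix(r, n):
    """pi_{1,...,n}(R): keep the first n components of every tuple."""
    return {t[:n] for t in r}


def quotient(r, s):
    """R / S computed exactly as in formula (3.8); r and s must be non-empty."""
    n = len(next(iter(r))) - len(next(iter(s)))      # r - s in the text
    candidates = project_prefix(r, n)                # pi(R)
    all_pairs = {a + b for a, b in product(candidates, s)}  # pi(R) x S
    missing = all_pairs - r                          # pairs that R lacks
    return candidates - project_prefix(missing, n)   # drop failing candidates


# Which students took every course listed in S?
R = {("ann", "db"), ("ann", "os"), ("bob", "db"), ("cid", "db"), ("cid", "os")}
S = {("db",), ("os",)}

print(sorted(quotient(R, S)))  # [('ann',), ('cid',)]
```

Here `("bob", "os")` is the only combination missing from `R`, so `bob` is the only candidate removed.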

The Join (also known as $\theta$-Join) of $R$ and $S$ on columns $i$ and $j$ is defined as:

$\begin{displaymath} R \Join_{i \theta j} S = \sigma_{\$i \theta \$(r+j)}(R \times S). \end{displaymath}$

(3.9)

Where $\theta$ is one of the comparison operators allowed in the selection formula (3.6), and $r$ is the arity of $R$. In other words, the result consists of all the tuples in the Cartesian product of $R$ and $S$ such that the $i$-th component of $R$ stands in relation $\theta$ to the $j$-th component of $S$. If $\theta$ is '$=$' then the operation is called the Equijoin.
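
Definition (3.9) translates almost literally into code: build the Cartesian product, then select on components $i$ and $r+j$. The sketch assumes relations are Python sets of tuples, 1-based column numbers, and a comparison passed as a function from the standard `operator` module; `theta_join` and its parameter names are hypothetical.

```python
import operator
from itertools import product


def theta_join(r, s, i, theta, j, r_arity):
    """Select from R x S the tuples where component i is theta-related to r + j."""
    prod = {x + y for x, y in product(r, s)}
    return {t for t in prod if theta(t[i - 1], t[r_arity + j - 1])}


R = {(1, 10), (2, 20)}        # arity r = 2
S = {(10, "x"), (15, "y")}

# Column 2 of R is less than column 1 of S.
print(theta_join(R, S, 2, operator.lt, 1, r_arity=2))  # {(1, 10, 15, 'y')}

# The same columns compared with equality.
print(theta_join(R, S, 2, operator.eq, 1, r_arity=2))  # {(1, 10, 10, 'x')}
```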

The Natural Join is a special case of the Join. It is applicable only when both $R$ and $S$ have columns that are named by attributes. Assuming that $\langle A_1,A_2,\ldots,A_k \rangle$ are all the attribute names used for both $R$ and $S$, then the Natural Join is defined as:

$\begin{displaymath} R \Join S = \pi_{i_1,i_2,\ldots,i_m} \sigma_{R.A_1 = S.A_1 \wedge \ldots \wedge R.A_k=S.A_k}(R \times S). \end{displaymath}$

(3.10)

such that $i_1,\ldots,i_m$ is the list of all components of $R \times S$ , in order, except the components $S.A_1,\ldots,S.A_k$ .
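
A minimal sketch of the natural join, assuming each relation is given as a tuple of attribute names plus a set of tuples. It keeps the pairs from the Cartesian product that agree on every shared attribute and drops the duplicate columns coming from the second operand. The function name, the schemas, and the data are illustrative assumptions.

```python
from itertools import product


def natural_join(r_schema, r, s_schema, s):
    """Return (schema, tuples) of the natural join of r and s."""
    shared = [a for a in r_schema if a in s_schema]
    # Positions of S's columns that are kept (the non-shared ones).
    keep_s = [k for k, a in enumerate(s_schema) if a not in shared]
    out_schema = tuple(r_schema) + tuple(s_schema[k] for k in keep_s)
    out = set()
    for x, y in product(r, s):
        # Selection step: R.A = S.A for every shared attribute A.
        if all(x[r_schema.index(a)] == y[s_schema.index(a)] for a in shared):
            # Projection step: all of x, then the non-shared part of y.
            out.add(x + tuple(y[k] for k in keep_s))
    return out_schema, out


R_schema, R = ("A", "B"), {(1, "x"), (2, "y")}
S_schema, S = ("B", "C"), {("x", 100), ("x", 200), ("z", 300)}

schema, rows = natural_join(R_schema, R, S_schema, S)
print(schema)        # ('A', 'B', 'C')
print(sorted(rows))  # [(1, 'x', 100), (1, 'x', 200)]
```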

The Semijoin of relation $R$ by relation $S$, written $R \,\rhd\!\!\!\!< S$, is the projection onto the attributes of $R$ of the natural join of $R$ and $S$:

$\begin{displaymath} R \,\rhd\!\!\!<S = \pi_R(R \Join S) \end{displaymath}$

(3.11)

where the subscript $R$ stands for the list of attributes of $R$.
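
Projecting the natural join back onto the attributes of the first relation leaves exactly those of its tuples that match at least one tuple of the second relation on the shared attributes. The sketch below uses that equivalent filter form instead of materialising the join; it assumes relations are given as a tuple of attribute names plus a set of tuples, and the name `semijoin` and the data are hypothetical.

```python
def semijoin(r_schema, r, s_schema, s):
    """Tuples of r that join with at least one tuple of s."""
    shared = [a for a in r_schema if a in s_schema]
    r_pos = [r_schema.index(a) for a in shared]
    s_pos = [s_schema.index(a) for a in shared]
    # Values of the shared attributes that actually occur in s.
    s_keys = {tuple(y[k] for k in s_pos) for y in s}
    return {x for x in r if tuple(x[k] for k in r_pos) in s_keys}


R_schema, R = ("A", "B"), {(1, "x"), (2, "y"), (3, "x")}
S_schema, S = ("B", "C"), {("x", 100), ("z", 300)}

print(sorted(semijoin(R_schema, R, S_schema, S)))  # [(1, 'x'), (3, 'x')]
```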

Relational Algebra provides a natural way to formulate queries. With selection and projection, a question about a single relation can be asked. Adding the Cartesian product allows a query to concern more than one relation. Applying union or difference combines several queries into a more sophisticated one.

Igor Wojnicki 2005-11-07