The Relational Model

The Relational Model was first introduced in 1970 in the paper ``A Relational Model of Data for Large Shared Data Banks'' [2]. Perhaps one of the most important reasons for the model's popularity is its support for powerful yet simple declarative languages, which express the operations that can be applied to data [22]. Database systems based on the Relational Model are called Relational Database Management Systems (RDBMS).

The Relational Model has a strong yet simple mathematical basis [6]. From a mathematical point of view, a relation is a subset of the Cartesian product of a list of domains, where a domain is a set of values. The Cartesian product of domains $D_1,D_2,\ldots,D_n$, denoted $D_1\times D_2 \times \ldots \times D_n$, is the set of all ordered $n$-tuples $\langle v_1,v_2,\ldots,v_n \rangle$ where $v_i$ is in $D_i$ for $i = 1,2,\ldots, n$. Any subset of the Cartesian product is a relation, and $n$ is called the arity of the relation.
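The definition above can be illustrated with a minimal Python sketch. The domains `D1`, `D2`, `D3`, their values, and the helper `is_relation` are hypothetical names chosen only for this example; they do not come from the cited sources.

```python
from itertools import product

# Hypothetical example domains; each domain is simply a set of values.
D1 = {1, 2}
D2 = {"red", "green"}
D3 = {"S", "M", "L"}

# Cartesian product D1 x D2 x D3: the set of all ordered 3-tuples <v1, v2, v3>.
cartesian = set(product(D1, D2, D3))

# A relation of arity 3 is any subset of that product.
relation = {(1, "red", "S"), (2, "green", "L")}


def is_relation(candidate, domains):
    """Return True if every tuple has the right arity and each v_i is in D_i."""
    return all(
        len(t) == len(domains) and all(v in d for v, d in zip(t, domains))
        for t in candidate
    )


print(len(cartesian))                       # 2 * 2 * 3 = 12 tuples
print(relation <= cartesian)                # subset test against the full product
print(is_relation(relation, [D1, D2, D3]))  # same check without building the product
```

Checking membership component by component, as `is_relation` does, avoids materializing the whole product, which grows multiplicatively with the sizes of the domains.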

Relations can be visualized as tables. Each row is a tuple and each column corresponds to one domain (component). The relational model represents a database as a collection of relations [2,6,22,23].

There is an extended formulation of relations that builds on this mathematical basis. Each relation is characterized by its name and a list of named attributes. A tuple is an ordered set of attribute values for a given relation. The value of an attribute is drawn from the attribute's domain, which is a set of atomic (in other words, indivisible) values. The domain can be perceived as the data type of the attribute. A relation schema is defined by a relation name $R$ and a list of attributes $A_i$:


\begin{displaymath}
R(A_1,A_2,\ldots,A_n).
\end{displaymath} (3.1)

The domain $D_i=dom(A_i)$ is defined for each attribute. A relation state (or just relation) of the relation schema $R(A_1,A_2,\ldots,A_n)$, denoted by $r(R)$, is a set of tuples $\{t_1,\ldots,t_m\}$. Each tuple is an ordered list of $n$ values $t=\langle v_1,\ldots,v_n \rangle$, and each value $v_i$ is either an element of $dom(A_i)$ or $null$, which means that the value does not exist or is unknown. The values in a tuple $t$ are referred to by attribute name, as $t[A_i]$. Relations represent facts about entities or facts about relationships between entities. A relational database schema is a set of relation schemas: $S=\{R_1,\ldots,R_p\}$.
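As a rough illustration of these definitions, the following Python sketch models one relation schema and one relation state. The schema `EMPLOYEE(id, name, dept)`, its domains, and the helpers `make_tuple` and `get` are assumptions made up for this example, and $null$ is represented by Python's `None`.

```python
# Hypothetical relation schema EMPLOYEE(id, name, dept).
ATTRIBUTES = ("id", "name", "dept")

# dom(A_i) expressed as a membership test per attribute (illustrative domains).
DOM = {
    "id": lambda v: isinstance(v, int) and not isinstance(v, bool),
    "name": lambda v: isinstance(v, str),
    "dept": lambda v: v in {"R&D", "Sales"},
}

NULL = None  # stands for "does not exist or unknown"


def make_tuple(*values):
    """Build t = <v_1, ..., v_n>, requiring each v_i to be in dom(A_i) or null."""
    if len(values) != len(ATTRIBUTES):
        raise ValueError("wrong arity")
    for attr, v in zip(ATTRIBUTES, values):
        if v is not NULL and not DOM[attr](v):
            raise ValueError(f"{v!r} is not in dom({attr})")
    return tuple(values)


def get(t, attr):
    """Access a component by attribute name, i.e. t[A_i]."""
    return t[ATTRIBUTES.index(attr)]


# A relation state r(EMPLOYEE): a set of tuples, so duplicates cannot occur.
r = {
    make_tuple(1, "Ada", "R&D"),
    make_tuple(2, "Bob", NULL),  # department unknown
}

for t in sorted(r, key=lambda t: get(t, "id")):
    print(get(t, "id"), get(t, "name"), get(t, "dept"))
```

Storing the state as a Python `set` mirrors the definition of $r(R)$ as a set of tuples; a real RDBMS would enforce domains and nulls through its own type system rather than through predicates like these.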

To identify a tuple in a relation, a key is needed [22]. A key is a set of attributes that makes it possible to distinguish tuples. A set $S$ of attributes of a relation $R$ is a key if:

  1. No instance of $R$ that represents a possible state can have two tuples that agree in all attributes of $S$.
  2. No proper subset of $S$ has property 1.

Whether a set of attributes is a key depends on the schema, not on the current instance of the relation [22]. A relation can have more than one key; such keys are usually called candidate keys. It is useful to select one of them and, for consistency, to treat it as the only key. The selected key is called the primary key. The choice of primary key is made at the design stage.
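The two key properties can be tested against a single instance, as in the Python sketch below. Since keyness is a property of the schema, such a test can only rule a candidate out; passing it on one instance does not prove that the attribute set is a key. The function names and the sample rows are hypothetical and serve only as an illustration.

```python
from itertools import combinations


def is_unique(rows, attrs):
    """Property 1 on one instance: no two rows agree on all attributes in attrs."""
    seen = set()
    for row in rows:
        projection = tuple(row[a] for a in attrs)
        if projection in seen:
            return False
        seen.add(projection)
    return True


def is_key_for_instance(rows, attrs):
    """Check properties 1 and 2 against the given rows only."""
    attrs = tuple(attrs)
    if not is_unique(rows, attrs):
        return False
    # Property 2: no proper subset of attrs may already distinguish all rows.
    for size in range(len(attrs)):
        for subset in combinations(attrs, size):
            if is_unique(rows, subset):
                return False
    return True


# Hypothetical instance; each row maps attribute names to values.
rows = [
    {"id": 1, "name": "Ada", "dept": "R&D"},
    {"id": 2, "name": "Bob", "dept": "R&D"},
    {"id": 3, "name": "Ada", "dept": "Sales"},
]

print(is_key_for_instance(rows, ["id"]))            # unique and minimal here
print(is_key_for_instance(rows, ["id", "name"]))    # unique but not minimal
print(is_key_for_instance(rows, ["name"]))          # two rows share a name
print(is_key_for_instance(rows, ["name", "dept"]))  # passes on this instance only
```

The last call shows the limitation: `name` together with `dept` happens to distinguish these three rows, yet nothing in a realistic schema would forbid two employees with the same name in one department, so this pair should not be declared a key.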

Igor Wojnicki 2005-11-07