1.1 Relational Databases

Ron McFadyen

1. RELATIONAL DATABASES AND MS ACCESS

1.1 Relational Databases

Relational Databases were introduced by E. F. Codd in 1969¹; Codd’s 1970 paper² is considered one of the great papers in Computer Science.

We begin with a very small example: a database with one relation, the list of employees shown in figure 1.1. You should notice this looks just like a two-dimensional table of rows and columns. The name of the table is Employees, each column of the table has its own title, and each row has the same structure. Each row has a value for employee number, first name, last name, and gender. As tables of data appear in so many places (newspaper articles, textbooks, web pages, etc.) it is very likely you have seen and used this representation for data previously.

Employee ID	First Name	Last Name	Gender
123	Joe	Smith	Male
333	Jim	Jones	Male
456	April	Smith	Female
842	Jenny	Jones	Female
777	Tom	Lee	Male

Figure 1.1: A list of employees

Let us assume the Employees table in figure 1.1 has one row for each employee who works for some hypothetical company. Data kept for each employee comprises their employee identification number, their first and last names, and their gender. Information structured in tables is very concise; at a glance we can obtain useful information.

According to the database design methodology in Information Modeling and Relational Databases³, a database designer must be able to express structured information as verbalizations. A verbalization that fits the information in one row of the Employees table is:

- - Employee with ID … has a first name …, a last name …, and is of … gender

In verbalizations like this the ellipses are placeholders: we can use values from a single row to create complete statements that explain the meaning of a row. For example,

- - Employee with ID 123 has a first name Joe, a last name Smith, and is of Male gender
  - Employee with ID 333 has a first name Jim, a last name Jones, and is of Male gender

A similar approach to organizing knowledge about data appears in the literature on literacy. In the Journal of Reading several articles by Kirsch and Mosenthal discuss the organization of information and its conceptualization as document sentences. In Building Documents by Combining Simple Lists⁴, Kirsch and Mosenthal present an example based on information from The World Almanac and Book of Facts: 1980 (Newspaper Enterprise Association, p. 427). That data is reproduced in figure 1.2.

Magazines	Circulation
TV Guide	19,547,763
Reader’s Digest	18,094,192
National Geographic	10, 249,748
Better Homes & Gardens	8,007,202
Family Circle	7,611,578
Woman ‘s Day	7,535,855
McCall’s	6,502,880

Figure 1.2: Circulation of leading U.S. magazines

A major point the authors make is that such information can be re-conceptualized as a series of simple document sentences formed from a basic document sentence. This document sentence expresses an understanding of the tabular data in natural language.

The document sentence for figure 1.2 is:

- - Magazine X has a circulation of Y.

Kirsch and Mosenthal use variables (X and Y) to stand for data that comes from a table. Taking values from a row, we plug values for X and Y into the document sentence to obtain sentence instantiations:

- - TV Guide has a circulation of 19,547,763.
  - Reader’s Digest has a circulation of 18,094,192.
  - National Geographic has a circulation of 10,249,748.
  - Better Homes & Gardens has a circulation of 8,007,202.
  - Family Circle has a circulation of 7,611,578.
  - Woman’s Day has a circulation of 7,535,855.
  - McCall’s has a circulation of 6,502,880.

Document sentences and verbalization sentences are essentially the same. Both sentences use natural language to express in words the meaning of tabular data. Whether one is designing databases or reading structured information, it can be useful for understanding to re-formulate data as statements in natural language.

Let us be a bit formal for a moment. Commercial relational database systems are systems where data is organized into relations. Figure 1.3 shows the general structure of a relation. We say a relation comprises a set of tuples where each tuple has the same number of attribute values, where each attribute value is taken from some corresponding domain, and where a domain represents a set of valid values for an attribute.

**Figure 1.3: General structure of a relation**

The Employees table in figure 1.1 can be considered a relation of 5 tuples where each tuple has 4 values drawn from each of the employee identifier, first name, last name, and gender domains. Similarly, we can say the lists comprising the Circulation of leading U.S. Magazines in figure 1.2 can be considered a relation with 7 tuples each having 2 attribute values.

Relations are typically implemented in commercial databases as tabular structures comprising rows and a fixed number of columns. Everybody is familiar with tables as they are commonplace in textbooks, papers, magazines, etc. This simplicity of representation is one reason why relational databases have been very successful as repositories for important data.

Exercises

To design a database, a database engineer needs to find good representations of how an organization uses data. Good sources include input forms, reports, web pages, etc. A challenge for database designers is to find these sources and interpret them.

1) Consider the following table of product information sold by ABC Foods. Verbalize the information presented.

Product ID	Product Name	Unit Price	Units In–Stock
1	Black Tea	$2.00	44
2	Green Tea	$3.00	33
3	Vegetarian Lasagne	$10.00	20
4	Cajun Seasoning	$11.00	29
5	Cranberry Sauce	$21.00	0

2) Suppose the following input form is used to enter contact information. Verbalize the information that is being collected:

3) Consider the following report that the Human Resources department of ABC Foods must produce. Verbalize the information in that report.

Employee ID	First Name	Last Name	Department
1	John	Smith	Receiving
2	Lee	Daniels	Sales
3	April	Turner	Sales
4	Thomas	Trump	Marketing
5	Lee	Smith	Marketing

1 Derivability, Redundancy, and Consistency of Relations Stored in Large Data Banks; IBM Research Report, 1969.

2 A Relational Model of Data for Large Shared Data Banks; CACM 13, No. 6, June 1970.

3 Information modeling and relational databases, 2nd edition; Terry Halpin and Tony Morgan; Morgan Kaufmann Publishers; ISBN -13 978-0-12-373568-3.

4 Building documents by combining simple lists; Irwin S. Kirsch and Peter B. Mosenthal; Journal of Reading, Vol. 33, No. 2, pp. 132-134.

License

Icon for the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License

Relational Databases and Microsoft Access 365 Copyright © by Ron McFadyn is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License, except where otherwise noted.

License

Share This Book