1
|
First normal form (1NF or Minimal Form) is a normal form used in database normalization. A relational database table that adheres to 1NF is one that meets a certain minimum set of criteria. These criteria are basically concerned with ensuring that the table is a faithful representation of a relation"[T]he overriding requirement, to the effect that the table must directly and faithfully represent a relation, follows from the fact that 1NF was originally defined as a property of relations, not tables." Date, C.J. "What First Normal Form Really Means" in Date on Database: Writings 2000-2006 (Springer-Verlag, 2006), p. 128. and that it is free of repeating groups"First normal form excludes variable repeating fields and groups." Kent, William. "A Simple Guide to Five Normal Forms in Relational Database Theory", Communications of the ACM 26 (2), Feb. 1983, pp. 120-125..
The concept of a "repeating group" is, however, understood in different ways by different theorists. As a consequence, there is not universal agreement as to which features would disqualify a table from being in 1NF. Most notably, 1NF as defined by some authors (for example, Ramez Elmasri and Shamkant B. NavatheElmasri, Ramez and Navathe, Shamkant B. Fundamentals of Database Systems, Fourth Edition (Addison-Wesley, 2003), p. 315., following the precedent established by E.F. Codd) excludes relation-valued attributes (tables within tables); whereas 1NF as defined by other authors (for example, Chris Date) permits them.
Contents |
According to Date\'s definition of 1NF, a table is in 1NF if and only if it is "isomorphic to some relation", which means, specifically, that it satisfies the following five conditions:
{{{1}}}
Violation of any of these conditions would mean that the table is not strictly relational, and therefore that it is not in 1NF.
Examples of tables (or views) that would not meet this definition of 1NF are:
Date\'s fourth condition, which expresses "what most people think of as the defining feature of 1NF"Date, C.J. "What First Normal Form Really Means" p. 128., is concerned with repeating groups. The following example illustrates how a database design might incorporate repeating groups, in violation of 1NF.
Suppose a novice designer wishes to record the names and telephone numbers of customers. He defines a customer table which looks like this:
| Customer ID | First Name | Surname | Telephone Number |
|---|---|---|---|
| 123 | Rachel | Ingram | 555-861-2025 |
| 456 | James | Wright | 555-403-1659 |
| 789 | Maria | Fernandez | 555-808-9633 |
The designer then becomes aware of a requirement to record multiple telephone numbers for some customers. He reasons that the simplest way of doing this is to allow the "Telephone Number" field in any given record to contain more than one value:
| Customer ID | First Name | Surname | Telephone Number |
|---|---|---|---|
| 123 | Rachel | Ingram | 555-861-2025 |
| 456 | James | Wright | 555-403-1659 555-776-4100 |
| 789 | Maria | Fernandez | 555-808-9633 |
Assuming, however, that the Telephone Number column is defined on some Telephone Number-like domain (e.g. the domain of strings 12 characters in length), the representation above is not in 1NF. 1NF (and, for that matter, the RDBMS) prohibits a field from containing more than one value from its column\'s domain.
The designer might attempt to get around this restriction by defining multiple Telephone Number columns:
| Customer ID | First Name | Surname | Tel. No. 1 | Tel. No. 2 | Tel. No. 3 |
|---|---|---|---|---|---|
| 123 | Rachel | Ingram | 555-861-2025 | ||
| 456 | James | Wright | 555-403-1659 | 555-776-4100 | |
| 789 | Maria | Fernandez | 555-808-9633 |
This representation, however, makes use of nullable columns, and therefore does not conform to Date\'s definition of 1NF. Even if the view is taken that nullable columns are allowed, the design is not in keeping with the spirit of 1NF. Tel. No. 1, Tel. No. 2., and Tel. No. 3. share exactly the same domain and exactly the same meaning; the splitting of Telephone Number into three headings is artificial and causes logical problems. These problems include:
The designer might, alternatively, retain the single Telephone Number column but alter its domain, making it a string of sufficient length to accommodate multiple telephone numbers:
| Customer ID | First Name | Surname | Telephone Number |
|---|---|---|---|
| 123 | Rachel | Ingram | 555-861-2025 |
| 456 | James | Wright | 555-403-1659, 555-776-4100 |
| 789 | Maria | Fernandez | 555-808-9633 |
This design is not consistent with 1NF, and presents several design issues. The Telephone Number heading becomes semantically woolly, as it can now represent either a telephone number, a list of telephone numbers, or indeed anything at all. A query such as "Which pairs of customers share a telephone number?" is arguably more difficult to formulate, given the necessity to cater for lists of telephone numbers as well as individual telephone numbers. Meaningful constraints on telephone numbers are also very difficult define in the RDBMS with this design.
A design that is unambiguously in 1NF makes use of two tables: a Customer table and a Customer Telephone Number table.
| Customer ID | First Name | Surname |
|---|---|---|
| 123 | Rachel | Ingram |
| 456 | James | Wright |
| 789 | Maria | Fernandez |
| Customer ID | Telephone Number |
|---|---|
| 123 | 555-861-2025 |
| 456 | 555-403-1659 |
| 456 | 555-776-4100 |
| 789 | 555-808-9633 |
Repeating groups of telephone numbers do not occur in this design. Instead, each Customer-to-Telephone Number link appears on its own record.
Some definitions of 1NF, most notably that of E.F. Codd, make reference to the concept of atomicity. Codd states that the "values in the domains on which each relation is defined are required to be atomic with respect to the DBMS."Codd, E.F. The Relational Model for Database Management Version 2 (Addison-Wesley, 1990). Codd defines an atomic value as one that "cannot be decomposed into smaller pieces by the DBMS (excluding certain special functions)."Codd, E.F. The Relational Model for Database Management Version 2 (Addison-Wesley, 1990), p. 6.
Hugh Darwen and Chris Date have suggested that Codd\'s concept of an "atomic value" is ambiguous, and that this ambiguity has led to widespread confusion about how 1NF should be understood.Darwen, Hugh. "Relation-Valued Attributes; or, Will the Real First Normal Form Please Stand Up?", in C. J. Date and Hugh Darwen, Relational Database Writings 1989-1991 (Addison-Wesley, 1992)."[F]or many years," writes Date, "I was as confused as anyone else. What\'s worse, I did my best (worst?) to spread that confusion through my writings, seminars, and other presentations." Date, C.J. "What First Normal Form Really Means" in Date on Database: Writings 2000-2006 (Springer-Verlag, 2006), p. 108 In particular, the notion of a "value that cannot be decomposed" is problematic, as it would seem to imply that few, if any, data types are atomic:
Date suggests that "the notion of atomicity has no absolute meaning"Date, C.J. "What First Normal Form Really Means" p. 112.: a value may be considered atomic for some purposes, but may be considered an assemblage of more basic elements for other purposes. If this position is accepted, 1NF cannot be defined with reference to atomicity. Columns of any conceivable data type (from string types and numeric types to array types and table types) are then acceptable in a 1NF table—although perhaps not always desirable. Date argues that relation-valued attributes, by means of which a field within a table can contain a table, are useful in rare cases.Date, C.J. "What First Normal Form Really Means" pp. 121-126.
| Topics in Database normalization |
|
First normal form | Second normal form | Third normal form
|
This article is licensed under the GNU Free Documentation License. It uses material from Wikipedia