Database Fundamentals/Normalization

This lesson introduces normalization.

Objectives and Skills
Objectives and skills for the normalization portion of Microsoft Exam 98-364 Database Fundamentals include:
 * Understand data storage
 * Understand normalization
 * Understand the reasons for normalization, the five most common levels of normalization, how to normalize a database to third normal form

Readings

 * 1)  Database normalization

Multimedia

 * 1) Microsoft Virtual Academy: Relational Concepts
 * 2) YouTube: Normalization
 * 3) YouTube: Physical Database Design Methodology

Activities

 * 1) While using the types of entities based on the College database (student, course, instructor), convert the entities to an equivalent collection of tables that are normalized and are in fourth normal form.
 * 2) * Using this information, convert the unnormalized relation to fourth normal form:
 * 3) ** Create an example of a table that is in first normal form, but not in second normal form.
 * 4) ** Create an example of a table that is in second normal form but not in third normal form.
 * 5) ** Create an example of a table that is in third normal form but not in fourth normal form.

Lesson Summary

 * Database normalization is the process of organizing the columns (attributes) and tables (relations) of a relational database to minimize data redundancy. Normalization involves decomposing a table into less redundant (and smaller) tables without losing information; defining foreign keys in the old table referencing the primary keys of the new ones. The objective is to isolate data so that additions, deletions, and modifications of an attribute can be made in just one table and then propagated through the rest of the database using the defined foreign keys.
 * First normal form (1NF) is a property of a relation in a relational database. A relation is in first normal form if and only if the domain of each attribute contains only atomic (indivisible) values, and the value of each attribute contains only a single value from that domain. First normal form enforces these criteria: Eliminate repeating groups in individual tables. Create a separate table for each set of related data. Identify each set of related data with a primary key
 * Second normal form (2NF) is a normal form used in database normalization. A table that is in first normal form (1NF) must meet additional criteria if it is to qualify for second normal form. A table is in 2NF if it is in 1NF and every non-prime attribute of the table is dependent on the whole of every candidate key.
 * Third normal form is a normal form that is used in normalizing a database design to reduce the duplication of data and ensure referential integrity by ensuring that (1) the entity is in second normal form, and (2) all the attributes in a table are determined only by the candidate keys of that table and not by any non-prime attributes.
 * Fourth normal form (4NF) is a normal form used in database normalization. Introduced by Ronald Fagin in 1977, 4NF is the next level of normalization after Boyce–Codd normal form (BCNF). Whereas the second, third, and Boyce–Codd normal forms are concerned with functional dependencies, 4NF is concerned with a more general type of dependency known as a multivalued dependency. A Table is in 4NF if and only if, for every one of its non-trivial multivalued dependencies X \twoheadrightarrow Y, X is a superkey—that is, X is either a candidate key or a superset thereof.
 * Normal forms 5 and 6 are less commonly used. It is possible to design a database in 5NF from outset and adapt it into 6NF with slight tweaks.
 * Data redundancy is the existence of data that is additional to the actual data and permits correction of errors in stored or transmitted data.

Key Terms

 * fifth normal form (5NF)
 * first normal form (1NF)
 * form
 * fourth normal form (4NF)
 * normalization
 * normal forms (NF)
 * redundant data
 * second normal form (2NF)
 * third normal form (3NF)