Basics of normalization

From http://sqlmag.com/database-performance-tuning/sql-design-why-you-need-database-normalization

When you normalize a database, you have four goals:

  • Arranging data into logical groupings such that each group describes a small part of the whole
  • Minimizing the amount of duplicate data stored in a database
  • Organizing the data such that, when you modify it, you make the change in only one place
  • Building a database in which you can access and manipulate the data quickly and efficiently without compromising the integrity of the data in storage.

A poorly normalized database and poorly normalized tables can cause problems ranging from excessive disk I/O and subsequent poor system performance to inaccurate data. An improperly normalized condition can result in extensive data redundancy, which puts a burden on all programs that modify the data.

First Normal Form

For a table to be in 1NF you need to ensure that the data is atomic, having no repeating groups.

A data item is atomic if only one item is in each cell of a table.

Ex: Attribute ShippedTo encompasses a street address, a city, a state, a postal code, and a country abbreviation. To render this data atomic, you separate this single attribute into several—ShipAddr, ShipCity, ShipState, ShipZip, and ShipCountry.

Repeating groups are cells that have more than one occurrence.

1NF requires that a table have a primary key

Second Normal Form

2NF is a condition of full functional dependency on the whole primary key; the primary key must determine each non-primary key attribute.

2NF says that all non-primary key attributes must be fully functionally dependent on the whole primary key.

Third Normal Form

You achieve 3NF when you have resolved all transitive dependencies. You’ll have to test the attributes in each table to see whether, within a table, any non-key attribute determines the value of another non-key attribute. Such a determination defines transitive dependency. Attributes that do not contribute to the description of the primary key are removed from the table.

  • Each table is a flat file, or spreadsheet format, with all-atomic data items, no repeating groups, and a designated pkey.
  • Each table has all non-pkey attributes fully functionally dependent on the whole pkey.
  • All transitive dependencies are removed from each table.
Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s