Database Normalization is the process of restructuring the database to reduce the redundancy and unwanted characteristics like insertion, updation and deletion anamolies.Redundancy(repetition of data) increases the size of database and make it complex to handle. It leads to inconsistency of data in database.
Normalization is done for following reasons:
1) Reduce redundancy.
2) To remove Insertion, updation and deletion anamolies.
Problems before Normalization
Suppose we have following relation (table).
Now if we need to change the phone number of an employee (say John) in this table we have to change it in multiple rows. But if the update is successful only on some of the rows and not all then the data will be in a inconsistent state. This is known as update anomaly.
Consider the above relation, if an employee is not assigned any project, then we cannot insert data into the table. Say an employee (254, “Louis”,1245589746,?) . Here in this data we don’t know about the project assigned to the employee, so it can’t be recorded except if project field is give a value null. This is known as insertion anomaly.
In the above relation if we want to delete the project P1 in the table then it will lead to deletion of an employee “Mike” completely from the table which we never want. This is known as delete anomaly.
Following are the normal forms used to normalize the database so the redundancy can be reduced.
1) First Normal Form (1NF)
2) Second Normal Form (2NF)
3) First Normal Form (3NF)
5) Second Normal Form (4NF)
We will learn about these normal forms in brief here..
First Normal Form (1NF)
A relation is said to be in first normal form if every attribute is single valued attribute. Means every cell in the table must contain only one value. Also all the columns in table should have unique name. Read more about 1NF
Second Normal Form (2NF)
A relation is said to be in second normal form if
1. It is already in first normal form.
2. It should not contain any partial dependency.
Third Normal Form (3NF)
A relation is said to be in third normal form if
1. It is already in second normal form.
2. It should not contain any transitive dependency.
Boyce and Codd Normal Form (BCNF)
BCNF is a strict version of 3NF. A table is in BCNF if
1) It is in 3NF.
2) For functional dependency X→Y, X must be a super key.
Fourth Normal Form (4NF)
A table is in 4NF if
1) It is in BCNF
2) It doesn’t have multi valued dependency.