Normalization is the systematic process applied on the relations to reduce the degree of redundancy.
Normalization is defined as systematic because it always gives guarantee for following properties –
- Lossless decomposition.
- Dependency preservation.
Let’s assume a relation R is decomposed into set of relations R1, R2, R3,….Now if R1, R2, R3,… gets joined and all the data of original relation R can be derived, then such a decomposition is defined as Lossless decomposition.
Let’s assume a relation R is decomposed into R1, R2, R3…..
Now by making use of the functional dependencies of the relations R1, R2, R3……. if we can derive all the functional dependencies of the original relation R, then this property of decomposition is defined as dependency preservation.
It is the property of relation which indicates the amount of redundancy exist in a relation.
Note: Normal forms and Degree of redundancy are inversely proportional.
When the Normalization process gets applied on any relation it performs the following activity:
- It identifies the Normal form of the given relation.
- it will try to decompose the relation from its existing Normal form to the higher Normal form.
Procedure to find the highest Normal form of the relation:
- Find out all the possible candidate keys of the given relation.
- Divide the attributes of the relation into two groups –
- Identifies all the dependencies like –
- full dependencies,
- partial dependencies,
- transitive dependencies,
- overlapping candidate key dependencies.
- Refer to the definition and hierarchy of Normal forms to find out the highest possible Normal form of the given relation.
you may also like to read articles on Machine Learning:
- What is Linear Regression? Part: 1
- What is Linear Regression? Part: 2
- Co-variance and Correlation
- What is the Coefficient of Determination | R Square
- Feature Selection Techniques in Regression Model
- Important Use Cases of NLP