Normalization

Normalization of the database is a method of arranging a data in a database. Standardization is a structured method to decomposing data consistency tables and unwanted features such as addition, upgrading, and deletion anomalies. By extracting duplicated data from the relationship tables, a multi-step method brings data into tabular form. Database standardization is mainly used for two purposes.

  • Eliminating redundant (useless data).
  • Ensuring data dependencies make sense i.e. data is logically stored.

Normalization is defined as the process of breaking down redundant relationship schemes by breaking up their attributes into smaller relationship schemes which process desirable properties.

Objectives of Normalization Process.

  • To create a formal framework for analyzing relation schemas based on their keys.
  • To free relations from undesirable insertion, update and delete anomalies.
  • To obtain powerful relational retrieval algorithms based on a collection of primitive relational operator.
  • To reduce the need for restricting the relations as new data types are introduced.
  • To carry out a series of tests on individual relation schema so that relational database can be normalized to some degree.

Problems without Normalization:

Without normalization it is very difficult to handle and update database, without facing loss. Insertion, updating and deletion are very common operations in a database which are frequently used.

For Example:  table_student

S_idS_nameS_addressSubject
001GarishmaBhaktapurMathematics
002BipinKathmanduDBMS
003BipishaLalitpurC
004BindraPokharaC++
005AlexBiratnagarJava
006SayanaBhaktapurDSA
007BipishaLalitpurC
  • Updating Anomaly: To update address of a student who occurs twice or more than twice in a table. We will have to update S_address column in all the rows, else data will become inconsistent.
  • Insertion Anomaly: Suppose for a new admission, we have a Student id (S_id), name and address of a student but if student has not opted for any subjects yet then we have to insert NULL there, leading to insertion anomaly.
  • Deletion Anomaly: if (S_id) 001 has only one subject and temporarily he drops it, when we delete that row, entire student record will be deleted along with it.

Normalization rules are divided into following form:

First Normal Form(1NF)

The first normal states that no two rows of data must contain repeating group of information. That means that each set of columns must have a unique value that makes it impossible to use multiple columns to get the same row. Each table should be arranged in rows and each row should have a primary key which makes it special. A primary key is usually a single column but can sometimes be combined to create a single primary key.

For Example, 1:

Student Table:

S_nameS_ageSubject
Grishma18Mathematics,C
Bipin22DBMS
Alex17JAVA
Princy19AI,C++

In the first normal form any row must not have a column in which more than one value is saved, like separated with commas. These records must separate into multiple rows. Mathematics and C-programming are divided by comma in the above table.

Student Table following 1NF will be:

S_nameS_ageSubject
Grishma18Mathematcis
Grishma18C
Bipin22DBMS
Alex17JAVA
Princy19AI
Princy19C++

The data consistency increases with the first Normal Type, as there would be several columns of the same data in different rows but each row as a whole special.

For Example, 2:

Student Table:

S_nameS_weight
Grishma48 kg
Bipin55 kg
Alex42 kg
Princy16 kg

The above table is not in First Normal Form. In the first normal form, any row must not have a column in which more than one value is saved, separated into multiple columns by a space these records need to separate. Blank space divides the weight of the table and its unit in above.

Student Table following 1NF will be:

S_nameS_weightUnit
Grishma48kg
Grishma55kg
Bipin42kg
Alex16kg
Princy48kg

As there will be many columns with same data in multiple columns as a whole unique.

Second Normal Form (2NF)

The second normal form is based on dependencies. All primary key attributes of none are fully functional, depending on the main key attributes. The relation has to be in 1NF for 2NF.

For Example:

Item Table:

itemcolorpriceTax
T-shirtRed12000.13
T-shirtGreen12000.13
JacketRed25000.13
JacketYellow25000.13

The above table is not second normal form because price and tax depends on item but not in color

Item Table following 2NF will be:

These tables are in second normal form.

ItemColor
T-shirtRed
T-shirtGreen
JacketRed
JacketRed
ItemPriceTax
T-shirt12000.13
Jacket25000.15

Third Normal Form (3NF)

A relation is in the third normal form if it is in the second normal form and non-primary key attributes must non-transitive dependent upon primary key attributes. No non-key filed depends upon another. All the non-key fields depend on the primary key only.

Example: 1

ItemColor
T-shirtRed
T-shirtGreen
JacketRed
JacketRed
ItemPriceTax
T-shirt12000.13
Jacket25000.15

The above table are not in 3NF because : Tax depends on price not on item.

ItemColor
T-shirtRed
T-shirtGreen
JacketRed
JacketRed
ItemPrice
T-shirt1200
Jacket2500
PriceTax
12000.13
25000.15

Now the above table is Normalized in 3NF

For Example, 2:

NameAssignment 1Assignment 2
Niranjan Dassql queryFeature of RDBMS
Rajiv Shahsql queryRelational Alzebra
Chandan Dassql queryFeature of RDBMS
Assignment IDDescription
1sql query
2Features of RDBMS
3Relational Algebra
Assignment IDStudent ID
11
12
13
21
23
32
Student IDFirst NameLast Name
1NiranjanDas
2RajivShah
3chandanDas

image_pdfOpen as Pdfimage_printPrint

About the author

ProPathshala

View all posts

Leave a Reply

Your email address will not be published. Required fields are marked *