Difference Between Normalization And Denormalization In SQL


What Is Normalization?

Normalization is the process of organizing (decomposing) the data in a relational database according to a series of normal forms, in order to reduce data redundancy, improve data integrity, and eliminate insert, update and delete anomalies.

By normalizing a database, you arrange the data into tables and columns, ensuring that each table contains only related data; data that is not directly related goes into a table of its own. Normalization is an important part of relational database design for many reasons, but mainly because it allows the database to take up as little disk space as possible, which improves the speed, accuracy and efficiency of the database.

The main normal forms (NF), in increasing order of strictness, are:

  • 1NF
  • 2NF
  • 3NF
  • BCNF
  • 4NF
  • 5NF
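The decomposition these normal forms call for can be sketched in a few lines. The following is a minimal, hypothetical example (table and column names are illustrative, not from any real schema) using Python's built-in sqlite3 module: an unnormalized orders table that repeats customer details on every row is split into separate `customers` and `orders` tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Unnormalized: customer details repeated on every order row.
cur.execute("""CREATE TABLE orders_flat (
    order_id INTEGER PRIMARY KEY,
    customer_name TEXT,
    customer_city TEXT,
    product TEXT)""")
cur.executemany(
    "INSERT INTO orders_flat VALUES (?, ?, ?, ?)",
    [(1, "Alice", "Nairobi", "Laptop"),
     (2, "Alice", "Nairobi", "Mouse"),     # 'Alice, Nairobi' duplicated
     (3, "Bob",   "Mombasa", "Keyboard")])

# Normalized (roughly 3NF): customer facts stored once, referenced by key.
cur.execute("""CREATE TABLE customers (
    customer_id INTEGER PRIMARY KEY,
    name TEXT,
    city TEXT)""")
cur.execute("""CREATE TABLE orders (
    order_id INTEGER PRIMARY KEY,
    customer_id INTEGER REFERENCES customers(customer_id),
    product TEXT)""")
cur.executemany("INSERT INTO customers VALUES (?, ?, ?)",
                [(1, "Alice", "Nairobi"), (2, "Bob", "Mombasa")])
cur.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                [(1, 1, "Laptop"), (2, 1, "Mouse"), (3, 2, "Keyboard")])

# Each customer's city now lives in exactly one row.
nairobi = cur.execute(
    "SELECT COUNT(*) FROM customers WHERE city = 'Nairobi'").fetchone()
print(nairobi[0])  # 1 — stored once, not once per order
```

Note how the redundancy ("Alice, Nairobi" appearing on two order rows) disappears once the customer facts move to their own table.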

What You Need To Know About Normalization

  1. Normalization is the process of organizing (decomposing) the data in a relational database according to a series of normal forms, in order to reduce data redundancy, improve data integrity, and eliminate insert, update and delete anomalies.
  2. Normalization is used in OLTP systems, which emphasize fast insert, update and delete operations.
  3. Normalization's main focus is reducing data redundancy and inconsistency in the database.
  4. Normalization removes data redundancy, i.e. it eliminates duplicate data from a table and moves it into a separate table.
  5. In normalization, memory space is optimized because each piece of data is stored only once, in its own table.
  6. Normalization maintains data integrity, i.e. adding or deleting data does not create any mismatch in the relationships between tables.
  7. Normalization is generally used where a large number of insert/update/delete operations are performed and joins of those tables are not expensive.
  8. Normalization splits data across additional tables, so the number of tables in the database increases.
  9. Even though normalization creates multiple tables, insert, update and delete operations remain efficient: each change is confined to the one table that holds that data, so there is no fear of data loss or inconsistency (data integrity).
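Point 6 above can be demonstrated concretely. A small sketch (hypothetical schema again) using sqlite3: with normalized tables and an enforced foreign key, deleting a row that other rows still reference is rejected, so the relationship between tables cannot silently break.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.execute("CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("""CREATE TABLE orders (
    order_id INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers(customer_id))""")
conn.execute("INSERT INTO customers VALUES (1, 'Alice')")
conn.execute("INSERT INTO orders VALUES (10, 1)")

# Deleting a customer who still has orders violates the relationship,
# so the database refuses the change instead of leaving orphaned orders.
try:
    conn.execute("DELETE FROM customers WHERE customer_id = 1")
except sqlite3.IntegrityError as e:
    print("rejected:", e)
```

The delete fails with an integrity error and the customer row survives, which is exactly the "no mismatch in the relationship of the tables" guarantee described above.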

What Is Denormalization?

Denormalization is the process of combining data from multiple tables into a single table so that data retrieval is faster. It is a strategy that database managers use to improve the performance of a database.

Usually denormalization entails adding redundant copies of data, or precomputed structures, so that a query can be answered from one place without touching the other tables the data came from. When a query combines data from multiple tables into a single result table, it is referred to as a join; the poor performance of such joins under complex queries is often what prompts the administrator to explore denormalization as an alternative.

The main methods of denormalization are:

  • Adding Redundant columns
  • Adding derived columns
  • Collapsing the tables
  • Snapshots
  • VARRAYS
  • Materialized Views
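Two of the methods listed above — redundant columns and derived columns — can be sketched in one small example. The schema below is hypothetical: a `customer_name` column is copied into the order table (redundant), and a `line_total` column is precomputed from quantity and unit price (derived), so reads need neither a join nor arithmetic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE order_items (
    item_id INTEGER PRIMARY KEY,
    customer_name TEXT,   -- redundant: also stored in a customers table
    quantity INTEGER,
    unit_price REAL,
    line_total REAL)      -- derived: precomputed to avoid work at read time
""")
conn.execute("INSERT INTO order_items VALUES (1, 'Alice', 3, 9.50, 3 * 9.50)")

# The read is a single-table lookup with no join and no computation:
row = conn.execute(
    "SELECT customer_name, line_total FROM order_items").fetchone()
print(row)  # ('Alice', 28.5)
```

The trade-off is the one the next section describes: every copy and every precomputed value must be kept in sync whenever the underlying data changes.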

What You Need To Know About Denormalization

  1. Denormalization is the process of combining data from multiple tables into one table so that data retrieval is faster.
  2. Denormalization is used in OLAP systems, which emphasize fast search and analysis.
  3. Denormalization's main focus is achieving faster query execution by introducing redundancy.
  4. Denormalization creates data redundancy, i.e. duplicate data may be found in the same table.
  5. Memory space is not optimized in denormalization because duplicate copies of the same data are stored.
  6. Denormalization makes data integrity harder to maintain, since redundant copies of the same data can fall out of sync.
  7. Denormalization is used where joins are expensive and queries are run frequently on the tables.
  8. In denormalization, duplicate data sits in a single table, so care must be taken to insert, update or delete all related copies together; failure to do so gives rise to integrity issues.
  9. During denormalization, data from several tables is combined, so the number of tables in the database decreases.
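Point 8 above is worth seeing in code. In this hypothetical denormalized table, the same fact ("Alice") is stored on many rows, so a single logical change must touch every copy; a `WHERE` clause that misses one row would leave the table inconsistent.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE orders_flat (
    order_id INTEGER PRIMARY KEY,
    customer_name TEXT,
    product TEXT)""")
conn.executemany(
    "INSERT INTO orders_flat VALUES (?, ?, ?)",
    [(1, "Alice", "Laptop"),
     (2, "Alice", "Mouse"),
     (3, "Bob",   "Keyboard")])

# Renaming the customer must update every duplicated copy:
cur = conn.execute(
    "UPDATE orders_flat SET customer_name = 'Alicia' "
    "WHERE customer_name = 'Alice'")
print(cur.rowcount)  # 2 rows changed; a normalized design would change 1
```

In the normalized version of this schema the same rename is a one-row update in the customers table, which is why normalization is preferred for update-heavy workloads.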


Difference Between Normalization And Denormalization In SQL, In Tabular Form

| BASIS OF COMPARISON | NORMALIZATION | DENORMALIZATION |
| --- | --- | --- |
| Description | The process of organizing (decomposing) data in a relational database according to a series of normal forms, in order to reduce data redundancy, improve data integrity and eliminate insert, update and delete anomalies. | The process of combining data from multiple tables into one table so that data retrieval is faster. |
| Application | Used in OLTP systems, which emphasize fast insert, update and delete operations. | Used in OLAP systems, which emphasize fast search and analysis. |
| Main focus | Reducing data redundancy and inconsistency. | Achieving faster query execution by introducing redundancy. |
| Data redundancy | Removes redundancy: duplicate data is eliminated from a table and moved to a separate table. | Creates redundancy: duplicate data may be found in the same table. |
| Memory space | Optimized, because each piece of data is stored only once. | Not optimized, because duplicate copies of the same data are stored. |
| Data integrity | Maintained: adding or deleting data does not create mismatches between related tables. | Harder to maintain: redundant copies of the same data can fall out of sync. |
| Use | Used where many insert/update/delete operations are performed and joins are not expensive. | Used where joins are expensive and queries are run frequently on the tables. |
| Number of tables | Increases, because data is split across additional tables. | Decreases, because data from several tables is combined into one. |
| Insert/update/delete | Efficient: each change is confined to one table, so there is no risk of inconsistent copies. | Requires care: all duplicated copies must be changed together, or integrity issues arise. |

Advantages of Normalization

  • Users can extend the database without necessarily impacting the existing data.
  • Minimizes null values.
  • Helps to reduce or avoid modification problems.
  • Searching, sorting and creating indexes can be faster, since tables are narrower and more rows fit on a data page.
  • Simplifies queries.
  • Makes the database smaller by eliminating redundant data.
  • Eliminates anomalies that would cause errors in the system.
  • Does not waste storage space.
  • Results in a more compact database (due to less data redundancy).
  • Results in a database that is simpler and easier to understand.

Disadvantages of Normalization

  • Normalization is a difficult task because it requires detailed analysis and design of the database.
  • Queries become more tedious to write because there are more tables to join.
  • Normalization can slow query performance because tables contain keys rather than real data, which must be reassembled through joins.
  • It makes queries more difficult, especially SQL that is constructed dynamically by desktop-friendly query tools, and it is hard to model the database well without knowing what the customers desire.
  • A poorly normalized database may perform badly and store data inefficiently.
  • Normalizing a database is sometimes complex because analysts have to understand the purpose of the database, such as whether it should be optimized for writing data, reading data or both.

Advantages Of Denormalization

  • It reduces the number of foreign keys and indexes. This helps to save on data manipulation time and memory as well.
  • It minimizes the number of necessary join queries.
  • In some cases, it reduces number of tables in the database.
  • Improves performance of the database by increasing speed

Disadvantages Of Denormalization

  • Denormalization usually speeds retrieval but can slow updates.
  • Denormalization is always application-specific and therefore must be re-evaluated whenever the application changes.
  • Denormalization can increase the size of tables.
  • Denormalization can make update and insert code difficult to write.
  • Data redundancy necessitates more storage.
  • Denormalization makes data integrity harder to enforce, since redundant copies can fall out of sync.
  • It wastes storage space due to redundant data.
