Skip to main content

Normalization in Action: Building Better Databases

What is Database Normalization?

Database normalization is an approach to evaluate database structure and minimize data redundancy based on candidate keys and functional dependencies. Candidate keys are unique identifiers that have all the necessary conditions to become a primary key. Functional dependencies refer to the relationship between attributes, where one attribute uniquely determines another.

Simply put normalization is a process of eliminating attributes which is not identified by candidate keys.

For example, consider a studentnumber that can uniquely identify a surname and firstname. In this case, the student number acts as a determinant, which determines the surname or firstname.

Student (studentnumber, surname, firstname)

The goal of normalization is to group attributes with a close logical relationship into the same table, so that it ensure there is no data redundancy.

Benefits:

  • Efficiently retrieve data: Well-structured data allows for faster and more accurate queries.
  • Avoid Complexity: Normalization simplifies the database structure by breaking it down into smaller, related tables.
  • Eliminate Duplicates: Redundant data can lead to anomalies (insertion, deletion, and update issues). Normalization helps minimize these.
  • Consistent Data: By organizing data, normalization ensures consistent values across related tables.

What Define Good and Bad Database Design?

Good:

  • Tables doesn't have data redundancy
  • Each determinant is a candidate key
  • Each determinant can uniquely identify other attributes

For example coursecode and offeringnumber can confidently identify other attributes CourseOffering (coursecode, offeringnumber, startdate, teachernumber)

Bad:

  • Relation/table has undesired data redundancy
  • A determinant is not a candidate key but can still identify another attribute in the table

For example teachernumber is not a candidate key but it can identify surname

CourseOffering (coursecode, offeringnumber, coursename, startdate, teachernumber, surname)

  • Partial dependency exists

(empno, clubno, clubname, joindate)

In this case empno and clubno are together a primary key and it can determine clubname, however clubno can also identify clubname, this leads to redundancy since clubname is dependent only on a part of the primary key.

Types of Normalization:

First Normal Form (1NF):

  • Each cell holds only one value (atomicity).
  • Primary key for identification.
  • No duplicated rows or columns.
  • Each column has a single value for each row.

Example: A table with employee details, where each employee has a unique ID and no duplicate rows.

Second Normal Form (2NF):

  • Builds on 1NF.
  • Ensures that non-key attributes depend entirely on the entire primary key (no partial dependencies).

Example: Separate tables for employee data and their projects, linked by employee ID.

Third Normal Form (3NF):

  • Builds on 2NF.
  • Eliminates transitive dependencies (attributes indirectly depend on other attributes).

Boyce-Codd Normal Form (BCNF) The redundancy prevention tool

  • Builds on 3NF.
  • Each determinant is a candidate key
  • All attribute values are atomic (single values)
  • If an attribute is not part of a candidate key, it must have a determinant














Comments

Popular posts from this blog

LINQ - Deferred Execution

Deferred Execution means that queries are not executed immediately at the time it's being created. The benefit of this is to improve performance by avoiding unnecessary executions. It also allows the query to be extendable when needed such as sorting, filtering. In the example below, the queries to retrieve courses are not being executed yet var context = new DbContext(); var courses = context.Courses      .Where(c => c.Level == 1)      .OrderBy(c => c.Name); The query is only executed when one of the following scenarios occurs: Iterating over query variable Calling ToList, ToArray, ToDictionary Calling First, Last, Single, Count, Max, Min, Average For example when we loop through the course or turn the result to a list: foreach ( var c in courses) Console.WriteLine(c.Name); OR context.Courses      .Where(c => c.Level == 1)      .OrderBy(c => c.Name).ToList();

Ensuring Data Integrity: The Role of Database Transactions

 1. What is database translation Database transactions are activities transferring, changing data from one consistent state to another consistent state for example in everyday life we make different kind of business transactions from buying products, changing or cancelling orders, buying tickets, etc. And all these activities involve the movement/transaction of data in the database. For example, consider a simple transaction of moving an amount of 5000 from one bank account to another. This transaction includes several steps: decrease the account balance by 5000, and then increase the other account balance by 50003. Each of these steps is a part of the transaction, and if any step fails, the entire transaction fails 2. Why it is crucial in reliable software systems? When it comes to business, whether it is personal or financial data, it is crucial to ensure the reliability and stability of data management. In our daily lives, we may frequently encounter errors related to database t...

Maximizing Efficiency: The Power of Database Indexing

What is database performance? There are two main aspects of database performance: response time and throughput . Response time is the total time it takes to process a single query and returns result to the user. It's critical metrics because it directly impacts the user's experience, especially in applications where fast access to data is essential. The response time includes CPU time (complex queries will require more computational power and increase processing time), disk access, lock waits in multiple-user environment (more about database transaction ), network traffic. Throughput refers to how many translations the system can handle per second (TPS). A transaction could include different activities to retrieve and manipulate data. A single command like SELECT, INSERT, UPDATE, DELETE or a series of commands could be used to trigger these activities. If you’re running an e-commerce site, a single transaction might include checking the inventory, confirming the payment, and...