Skip to main content

Ensuring Data Integrity: The Role of Database Transactions

 1. What is database translation

Database transactions are activities transferring, changing data from one consistent state to another consistent state for example in everyday life we make different kind of business transactions from buying products, changing or cancelling orders, buying tickets, etc. And all these activities involve the movement/transaction of data in the database.

For example, consider a simple transaction of moving an amount of 5000 from one bank account to another. This transaction includes several steps: decrease the account balance by 5000, and then increase the other account balance by 50003. Each of these steps is a part of the transaction, and if any step fails, the entire transaction fails

2. Why it is crucial in reliable software systems?

When it comes to business, whether it is personal or financial data, it is crucial to ensure the reliability and stability of data management.

In our daily lives, we may frequently encounter errors related to database transactions like customer orders, payments, or product shipment being lost in an online store, or fail to register a seat reservation, double booking a train/airplane passengers.

To ensure reliability of data operations we could follow ACID properties:

  • Atomicity: "All or nothing". Either the DBMS makes all the changes to the database state permanent or no change is made permanent.
  • Consistency: Database must be consistent before and after a transaction. The transaction should not be allowed to commit if any rule violation occurs
  • Isolation: If multiple transactions are happening at the same time (concurrently), they should be isolated from each others, and only happens one after another to keep the consistency and avoid conflicts
  • Durability: Once a transaction is done and successful, it should be permanent regardless of any failure like program crash

3. What is SQL transaction and SQL statements need to use to control SQL transaction?

SQL Transaction: An SQL transaction consists of a sequence of SQL statements that are executed as a single atomic unit. This means the transaction is executed in an “all-or-nothing” manner. When an application triggers a transaction, it may involve various operations such as setting the access mode, performing constraint checks, establishing transaction isolation levels, or defining the start and end times of the transaction.

Workflow for SQL transactions:

Begin transaction: starts a new transaction (explicit transaction start in SQL Server)

Commit: terminates the transaction and make all changes permanent

Rollback: terminate the transaction and roll back all the changes ( previous state will be stored in database)

BEGIN TRANSACTION
UPDATE account SET balance = balance + 100 WHERE accountNo = 2
UPDATE account SET balance = balance - 100 WHERE accountNo = 1
  if (everything succeeded) then
   COMMIT -- commit to the changes and make them permanent
  else
   ROLLBACK -- undo all changes from the database
  end-if

4. What is concurrency control?

A mechanism to make sure that a transaction is isolated from the effect of other transactions, that concurrently happens at the same time. The level of isolation should be set by programmer like below: 

Locking 

When a transaction changes a row in database with INSERT, UPDATE, DELETE, the DBMS lock the row exclusively so that other concurrent transactions are not allowed to access (read or modify) the row as long as this transaction is running.

Locking resolves concurrency conflicts in two ways: 

  • Blocking a Transaction: In this scenario, a transaction is required to wait for the necessary resource. By default, the maximum duration for this waiting period is set to an indefinite length of time, often referred to as “forever”.
  • Forcing a transaction to abort: When two transactions start to permanently block each other, the DBMS forces either one transaction to abort (rollback). This allows the other transaction to continue normally.

Deadlock

This occurs when two or more transactions permanently block each other by each transaction having a lock on a resource which the other transaction is trying to lock exclusively. 

For example, transaction A has locked a row in the Students table and needs to update some rows in the Grades table. At the same time, Transaction B has locked those very rows in the Grades table that Transaction A needs to update, but Transaction B needs to update the rows in the Student table that are locked by Transaction A. Now, Transaction A is waiting for Transaction B to release its lock, and Transaction B is waiting for Transaction A to release its lock. They’re both stuck waiting for each other, and that’s a deadlock actions permanently block each other by each transaction having a lock on a resource which the other transaction is trying to lock exclusively.

When the DBMS detects a deadlock it chooses one of the transactions as a victim and terminates it (rollback) with an error message. This allows the other transaction to continue normally.

Locks protect your data but decrease concurrency. Even without a deadlock a transaction can make others wait for a long time and it can seriously affect the performance of the application.

Row-level locking in SQL Server

Depending on the lock type, when one user has a lock on a row, the lock prevents other users from modifying or even reading that row. The basic lock types are write lock and read lock. In SQL Server, to avoid a certain type of deadlock, "intent to update lock" is also available

Write lock : This allows user to read and modify the database and other users are not allow to write or even read until the transaction ends

Read lock: this allows user can select a row if another transaction doesn't have the Write Lock on. There are two type of Read Lock:

  • READ COMMITTED: Lock is released immediately after user has finished reading
  • REPEATABLE READ OR SERIALIZABLE: lock is not released before the transaction ends

"Intent to update" lock (UPDATE LOCK) In SQL Server: this prevents a certain form of deadlock that occurs when two transactions are first reading the same row and then updating the row. The DBMS allows only one transaction at time to read the same row with the UPDLOCK. If another transaction tries to read the same row with the UPDLOCK, it has to wait until the first transaction ends.

5. Best practices

  • Avoid User Interaction During Transactions: To keep things running smoothly, it’s best not to have users do anything inside a transaction. If they do, the database transaction might be blocking other concurrent transactions for all too long.
  • Understand Your Database System: Different database systems have their own ways of handling concurrency control and managing behavior. It is important for reliability and performance that a developer must know how their chosen database system works in these areas.
  • Always Ensure Accurate Data: To prevent mistakes in your database, you should always explicitly set the transaction isolation level for a database connection.

In SQL Server, Oracle, and PostgreSQL the default transaction isolation level is READ COMMITTED. This transaction isolation level provides maximum concurrency, which allows the most users to work at the same time, but it is usually not enough for such transactions that both read and update the database in a multi-user environment.










Comments

Popular posts from this blog

Maximizing Efficiency: The Power of Database Indexing

What is database performance? There are two main aspects of database performance: response time and throughput . Response time is the total time it takes to process a single query and returns result to the user. It's critical metrics because it directly impacts the user's experience, especially in applications where fast access to data is essential. The response time includes CPU time (complex queries will require more computational power and increase processing time), disk access, lock waits in multiple-user environment (more about database transaction ), network traffic. Throughput refers to how many translations the system can handle per second (TPS). A transaction could include different activities to retrieve and manipulate data. A single command like SELECT, INSERT, UPDATE, DELETE or a series of commands could be used to trigger these activities. If you’re running an e-commerce site, a single transaction might include checking the inventory, confirming the payment, and...

LINQ - Deferred Execution

Deferred Execution means that queries are not executed immediately at the time it's being created. The benefit of this is to improve performance by avoiding unnecessary executions. It also allows the query to be extendable when needed such as sorting, filtering. In the example below, the queries to retrieve courses are not being executed yet var context = new DbContext(); var courses = context.Courses      .Where(c => c.Level == 1)      .OrderBy(c => c.Name); The query is only executed when one of the following scenarios occurs: Iterating over query variable Calling ToList, ToArray, ToDictionary Calling First, Last, Single, Count, Max, Min, Average For example when we loop through the course or turn the result to a list: foreach ( var c in courses) Console.WriteLine(c.Name); OR context.Courses      .Where(c => c.Level == 1)      .OrderBy(c => c.Name).ToList();

Solid Principles for Software Design - Part 2

Interface Segregation Principle The Interface Segregation Principle (ISP) is one of the five SOLID principles of object-oriented design, which recommends that "Clients should not be forced to depend on interfaces they do not use". This means we should avoid implementing an interface that has unnecessary methods and therefore not going to be implemented.  Some signs of violating ISP are: Having a "fat" interface, which means having a high number of methods in one interface that are not so related to each other or low cohesion. Empty implementation of methods, certain methods of interface are not needed for implementation. Considering the following example, we violate the principle because CannonPrinter is designed only with the functionality to print, leaving the scan and fax method unimplemented. interface IMultiFunction {      void print(); void scan(); void fax(); } public class HPPrinterNScanner implements ImultiFunction { @Override public void pr...