Skip to main content

Maximizing Efficiency: The Power of Database Indexing

What is database performance?

There are two main aspects of database performance: response time and throughput.

Response time is the total time it takes to process a single query and returns result to the user. It's critical metrics because it directly impacts the user's experience, especially in applications where fast access to data is essential. The response time includes CPU time (complex queries will require more computational power and increase processing time), disk access, lock waits in multiple-user environment (more about database transaction), network traffic.

Throughput refers to how many translations the system can handle per second (TPS). A transaction could include different activities to retrieve and manipulate data. A single command like SELECT, INSERT, UPDATE, DELETE or a series of commands could be used to trigger these activities. If you’re running an e-commerce site, a single transaction might include checking the inventory, confirming the payment, and updating the shipping details. Each of these steps might involve several queries and updates to the database.

Common ways to improve database performance:

  • Create indexes on tables
  • Improve SQL queries
  • Distribute data access on disks
  • Increase main memory of database server
  • Split and merge tables
  • Denormalise one or more tables: this can increase data redundancy but will speed up read operation because fewer tables joins are needed

What is database indexing?

Database indexing is a technique used to speed up the retrieval of data from a database. It’s similar to the index at the end of a book. Without an index, to find a particular topic in a book, you’d have to go through each page, which can be time-consuming. But with an index, you can directly go to the pages that contain the topic you’re looking for.

Advantages and disadvantages

Advantages

  • Speed up data retrial: find data faster without having to scan every row
  • Efficient use of resource: make database system more efficient and can serve more queries in same system resources (CPU, Memory)

Disadvantages

  • Indexes will need more disk space and also maintenance.
  • The DBMS has to update all indexes on the table when a row is inserted or deleted. When an existing row is updated the DBMS has to update related indexes accordingly. So if there are too many indexes on a table the performance may decrease on insert, update, and delete operation.

How to use Index?

Usually, indexes are created on:

  • Primary keys: SQL Server automatically creates a unique clustered index on the PK
  • Foreign key: Use CREATE INDEX to create an index on a foreign key.
  • Alternate keys: In SQL Server, UNIQUE constraint is physically implemented as a unique non-clustered index. SQL Server automatically creates this index.

In SQL Server, we can create index using the syntax below.

CREATE [ UNIQUE ] [ CLUSTERED | NONCLUSTERED ] INDEX index_name ON table_name ( column_name [ { , column_name }... ] )

For example:

CREATE INDEX ix_FK_deptno ON Employee(deptno)

To drop an index: DROP INDEX index_name ON table_name


Indexes usually are not useful for:
  • Tables that have a very small number of rows: Indexes are most effective on tables with a large number of rows. For small table, the overhead of maintaining an index (updating it every time data is inserted, updated, or deleted) can outweigh the performance benefits.
  • Columns that have a very low selectivity: A column like gender, which typically has only a few distinct values (e.g., ‘male’, ‘female’), has low selectivity. Indexes on such columns are often not beneficial because many rows have the same value, so the database still ends up scanning a large number of rows even when using the index.
  • Very large columns: Indexes on very large columns can consume a significant amount of storage space and can be slow to update. This is because the index contains a copy of the data in the indexed columns. If the data is very large, the index will also be large. This can slow down insert, update, and delete operations because the database has to keep the index up-to-date.
















Comments

Popular posts from this blog

LINQ - Deferred Execution

Deferred Execution means that queries are not executed immediately at the time it's being created. The benefit of this is to improve performance by avoiding unnecessary executions. It also allows the query to be extendable when needed such as sorting, filtering. In the example below, the queries to retrieve courses are not being executed yet var context = new DbContext(); var courses = context.Courses      .Where(c => c.Level == 1)      .OrderBy(c => c.Name); The query is only executed when one of the following scenarios occurs: Iterating over query variable Calling ToList, ToArray, ToDictionary Calling First, Last, Single, Count, Max, Min, Average For example when we loop through the course or turn the result to a list: foreach ( var c in courses) Console.WriteLine(c.Name); OR context.Courses      .Where(c => c.Level == 1)      .OrderBy(c => c.Name).ToList();

Solid Principles for Software Design - Part 2

Interface Segregation Principle The Interface Segregation Principle (ISP) is one of the five SOLID principles of object-oriented design, which recommends that "Clients should not be forced to depend on interfaces they do not use". This means we should avoid implementing an interface that has unnecessary methods and therefore not going to be implemented.  Some signs of violating ISP are: Having a "fat" interface, which means having a high number of methods in one interface that are not so related to each other or low cohesion. Empty implementation of methods, certain methods of interface are not needed for implementation. Considering the following example, we violate the principle because CannonPrinter is designed only with the functionality to print, leaving the scan and fax method unimplemented. interface IMultiFunction {      void print(); void scan(); void fax(); } public class HPPrinterNScanner implements ImultiFunction { @Override public void pr...