What is database performance?
There are two main aspects of database performance: response time and throughput.
Response time is the total time it takes to process a single query and returns result to the user. It's critical metrics because it directly impacts the user's experience, especially in applications where fast access to data is essential. The response time includes CPU time (complex queries will require more computational power and increase processing time), disk access, lock waits in multiple-user environment (more about database transaction), network traffic.
Throughput refers to how many translations the system can handle per second (TPS). A transaction could include different activities to retrieve and manipulate data. A single command like SELECT, INSERT, UPDATE, DELETE or a series of commands could be used to trigger these activities. If you’re running an e-commerce site, a single transaction might include checking the inventory, confirming the payment, and updating the shipping details. Each of these steps might involve several queries and updates to the database.
Common ways to improve database performance:
- Create indexes on tables
- Improve SQL queries
- Distribute data access on disks
- Increase main memory of database server
- Split and merge tables
- Denormalise one or more tables: this can increase data redundancy but will speed up read operation because fewer tables joins are needed
What is database indexing?
Database indexing is a technique used to speed up the retrieval of data from a database. It’s similar to the index at the end of a book. Without an index, to find a particular topic in a book, you’d have to go through each page, which can be time-consuming. But with an index, you can directly go to the pages that contain the topic you’re looking for.
Advantages and disadvantages
Advantages
- Speed up data retrial: find data faster without having to scan every row
- Efficient use of resource: make database system more efficient and can serve more queries in same system resources (CPU, Memory)
Disadvantages
- Indexes will need more disk space and also maintenance.
- The DBMS has to update all indexes on the table when a row is inserted or deleted. When an existing row is updated the DBMS has to update related indexes accordingly. So if there are too many indexes on a table the performance may decrease on insert, update, and delete operation.
How to use Index?
Usually, indexes are created on:
- Primary keys: SQL Server automatically creates a unique clustered index on the PK
- Foreign key: Use CREATE INDEX to create an index on a foreign key.
- Alternate keys: In SQL Server, UNIQUE constraint is physically implemented as a unique non-clustered index. SQL Server automatically creates this index.
In SQL Server, we can create index using the syntax below.
CREATE [ UNIQUE ] [ CLUSTERED | NONCLUSTERED ] INDEX index_name ON table_name ( column_name [ { , column_name }... ] )
For example:
CREATE INDEX ix_FK_deptno ON Employee(deptno)
To drop an index: DROP INDEX index_name ON table_name
- Tables that have a very small number of rows: Indexes are most effective on tables with a large number of rows. For small table, the overhead of maintaining an index (updating it every time data is inserted, updated, or deleted) can outweigh the performance benefits.
- Columns that have a very low selectivity: A column like gender, which typically has only a few distinct values (e.g., ‘male’, ‘female’), has low selectivity. Indexes on such columns are often not beneficial because many rows have the same value, so the database still ends up scanning a large number of rows even when using the index.
- Very large columns: Indexes on very large columns can consume a significant amount of storage space and can be slow to update. This is because the index contains a copy of the data in the indexed columns. If the data is very large, the index will also be large. This can slow down insert, update, and delete operations because the database has to keep the index up-to-date.
Comments
Post a Comment