Tuesday, April 18, 2006

DBMS - May 1996 - Indexing and Access Mechanisms:

Here's a performance tip that's easy to use: When creating a multicolumn index, put the most selective column first (that is, the one with the most distinct values). If your company has 500 customers and five products, the create index example shown previously would be appropriate. However, if your company has five customers and 500 products, it would be better to create the index with the following command:

CREATE INDEX sales_1996_idx
ON sales_1996 (product, customer);

Because product is 100 times more selective than customer in this example, index lookups will tend to find the desired rows with fewer disk reads, thereby resulting in better performance. On top of that, you will usually be able to perform partial index lookups on product alone, with a speed close to what you would get by building a separate index on product. Furthermore, partial index lookups on customer alone aren't necessary in this case because of the low selectivity, so you don't lose anything by putting customer last. If you put customer first, you lose the ability to perform partial index lookups on product alone.

Index Storage

The most common physical storage structure for SQL indexes is the B-tree. (Note: This article will use the general term B-tree to represent all of the related variations including B+-tree and B*-tree). Almost every SQL DBMS on the market supports B-tree indexes. Plus, some DBMSs support additional physical index structures such as hashing and Index Sequential Access Method (ISAM). B-tree indexes are popular because of their adaptability (the tree structure balances itself dynamically as a table grows, which maintains an efficient index structure by minimizing the number of disk reads to find a given value in the index).

An index is usually thought of as a distinct data structure on disk, separate from the table it is indexing. Although this is generally true, some DBMSs allow you to combine the index and the table data to improve performance. For example, SQL Server has a clustered parameter on the create index command that stores the actual data at the leaf level of the index tree. Because the table is so tightly integrated with the index, only one clustered index per table is possible. Similarly, CA-OpenIngres has a modify table command that restructures a table in the form of an index.

0 Comments:

Post a Comment

<< Home