Nitin Kumar Choudhary: INDEXS in SQL SERVER

A SQL table explanation is not good enough for getting the desired data very quickly or sorting the data in a specific order.

What we actually need for doing this is some sort of cross reference facilities where for certain columns of information within a table, it should be possible to get whole records of information quickly. But if we consider a huge amount of data in a table, we need some sort of cross reference to get to the data very quickly. This is where an index within SQL Server comes in.

So an index can be defined as:

“An index is an on-disk structure associated with a table or views that speed retrieval of rows from the table or view. An index contains keys built from one or more columns in the table or view”. These keys are stored in a structure (B-tree) that enables SQL Server to find the row or rows associated with the key values quickly and efficiently.”

Index Structures

For example, if you create an index on the primary key and then search for a row of data based on one of the primary key values, SQL Server first finds that value in the index, and then uses the index to quickly locate the entire row of data. Without the index, a table scan would have to be performed in order to locate the row, which can have a significant effect on performance.

An index is made up of a set of pages (index nodes) that are organized in a B-tree structure. This structure is hierarchical in nature, with the root node at the top of the hierarchy and the leaf nodes at the bottom, as shown in Figure below.

Figure : B-tree structure of a SQL Server index

When a query is issued against an indexed column, the query engine starts at the root node and navigates down through the intermediate nodes. The query engine continues down through the index nodes until it reaches the leaf node.

Clustered Indexes

A clustered index stores the actual data rows at the leaf level of the index. Returning to the example above, that would mean that the entire row of data associated with the primary key value of 123 would be stored in that leaf node.

An important characteristic of the clustered index is that the indexed values are sorted in either ascending or descending order. As a result, there can be only one clustered index on a table or view. In addition, data in a table is sorted only if a clustered index has been defined on a table.

Indexes are first sorted on the first column in the index, then any duplicates of the first column and sorted by the second column, etc.

Nonclustered Indexes

In non-clustered index the leaf nodes of a nonclustered index contain only the values from the indexed columns and row locators that point to the actual data rows, rather than contain the data rows themselves. This means that the query engine must take an additional step in order to locate the actual data.

A row locator’s structure depends on whether it points to a clustered table or to a heap.
1. If referencing a clustered table, the row locator points to the clustered index, using the value from the clustered index to navigate to the correct data row.
2. If referencing a heap, the row locator points to the actual data row.

(Note: A table that has a clustered index is referred to as a clustered table. A table that has no clustered index is referred to as a heap.)

--> Nonclustered indexes cannot be sorted like clustered indexes.

Max number of clustered index in table: 1

Max number of non-clustered index in table:

In sql server 2005- 249
in sql server 2008- 999

Composite index:
An index that contains more than one column. In both SQL Server 2005 and 2008, you can include up to 16 columns in an index.

Consider the following guidelines when planning your indexing strategy:

1. For tables that are heavily updated, use as few columns as possible in the index, and don’t over-index the tables.

2. If a table contains a lot of data but data modifications are low, use as many indexes as necessary to improve query performance.

3. For clustered indexes, try to keep the length of the indexed columns as short as possible.

4. The uniqueness of values in a column affects index performance. In general, the more duplicate values you have in a column, the more poorly the index performs.

5. In multi-column indexes, list the most selective (nearest to unique) first in the column list. For example, when indexing an employee table for a query on social security number (SSN) and last name (lastName), your index declaration should be:

CREATE NONCLUSTERED INDEX ix_Employee_SSN
ON dbo.Employee (SSN, lastName);

Syntax:

Create Clustered Index index_name on table_name (Columns_name)

CREATE NONCLUSTERED INDEX ix_Employee_SSN
ON dbo.Employee (SSN, lastName);

Disadvantages:

1. Both clustered indexes, and nonclustered indexes take up additional disk space. The amount of space that they require will depend on the columns in the index, and the number of rows in the table.

2. Indexes will increase the amount of time that your INSERT, UPDATE and DELETE statement take, as the data has to be updated in the table as well as in each index.

3. Columns of the TEXT, NTEXT and IMAGE data types can not be indexed using normal indexes. Columns of these data types can only be indexed with Full Text indexes.

Disadvantage for Clustured Index:

If we update a record and change the value of an indexed column in a clustered index, the database might need to move the entire row into a new position to keep the rows in sorted order. This behavior essentially turns an update query into a DELETE followed by an INSERT, with an obvious decrease in performance. A table's clustered index can often be found on the primary key or a foreign key column, because key values generally do not change once a record is inserted into the database.

Disadvantage for NonClustured Index :

The disadvantage of a non-clustered index is that it is slightly slower than a clustered index and they can take up quite a bit of space on the disk.

Another disadvantage is using too many indexes can actually slow your database down. Thinking of a book again, imagine if every "the", "and" or "at" was included in the index. That would stop the index being useful - the index becomes as big as the text! On top of that, each time a page or database row is updated or removed, the reference or index also has to be updated.

The DROP INDEX Command:

An index can be dropped using SQL DROP command. Care should be taken when dropping an index because performance may be slowed or improved.
The basic syntax is as follows:

DROP INDEX index_name;

Nitin Kumar Choudhary

Thursday, January 10, 2013

INDEXS in SQL SERVER

Index Structures

Clustered Indexes

Nonclustered Indexes

No comments:

Post a Comment