We’ll often hear SQL Server developers talk about adding indexes to a table to improve the performance of some of the more heavily run queries. Such queries could be quite complex, and/or search through large tables, perhaps with lots of joins, and generally take a long time to complete.
But it’s not just a matter of saying “add an index” and being done with it. We need to decide how to design the index, and even before that, what type of index to add.
Two of the most commonly used index types in SQL Server are the clustered index and the nonclustered index.
In this article, we’ll explore what a nonclustered index is, how it works, and when we should use it to enhance our database performance.
Understanding Nonclustered Indexes
A nonclustered index in SQL Server is a separate structure from the table that contains a copy of selected columns from the base table. It’s organized in a way that allows for fast retrieval of data based on the indexed columns, without having to scan the entire table.
Think of a nonclustered index as the index at the back of a book. It doesn’t change the order of the pages in the book (our data table), but it provides a quick way to find specific information by listing keywords and the pages where they appear.
How Nonclustered Indexes Work
When we create a nonclustered index, SQL Server builds a separate structure that contains:
- The key columns we’ve specified in the index definition
- A row locator that points back to the corresponding row in the table
For heap tables (tables without a clustered index), the row locator is a Row ID (RID). For tables with a clustered index, the row locator is the clustered index key.
When a query uses a nonclustered index, SQL Server can quickly scan the index to find the desired data and then use the row locator to retrieve the full row data if needed.
Comparing Nonclustered and Clustered Indexes
To better understand nonclustered indexes, let’s compare them to clustered indexes:
- Storage: A clustered index determines the physical order of data in a table. In contrast, a nonclustered index is stored separately from the table data.
- Quantity: We can have only one clustered index per table, but we can have multiple nonclustered indexes.
- Lookup speed: Clustered indexes are generally faster for range queries, while nonclustered indexes can be faster for selective queries.
- Size: Nonclustered indexes are typically smaller than the table itself, as they only contain the indexed columns and row locators.
When to Use Nonclustered Indexes
We should consider creating nonclustered indexes in the following scenarios:
- Frequent search conditions: Columns that are often used in
WHERE
clauses orJOIN
conditions are good candidates for nonclustered indexes. - Foreign key columns: Creating nonclustered indexes on foreign key columns can significantly improve the performance of joins between tables.
- Columns used in
ORDER BY
orGROUP BY
: Indexing these columns can speed up sorting and aggregation operations. - Selective queries: When our queries return a small percentage of rows from a large table, nonclustered indexes can be very effective.
Creating a Nonclustered Index
Let’s look at an example of how to create a nonclustered index in SQL Server:
CREATE TABLE Employees (
EmployeeID INT PRIMARY KEY,
FirstName NVARCHAR(50),
LastName NVARCHAR(50),
Department NVARCHAR(50),
Salary DECIMAL(10, 2)
);
-- Create a nonclustered index on the Department column
CREATE NONCLUSTERED INDEX IX_Employees_Department
ON Employees (Department);
-- Create a composite nonclustered index on LastName and FirstName
CREATE NONCLUSTERED INDEX IX_Employees_Name
ON Employees (LastName, FirstName);
In this example, we’ve created two nonclustered indexes:
- A single-column index on the
Department
column - A composite index on
LastName
andFirstName
These indexes will improve the performance of queries that search or sort by department or employee names.
We also created a clustered index on the table. We know this because we created a primary key. And by default, whenever we create a primary key, SQL Server creates a clustered index. That is unless the table already has a clustered index, or we explicitly specify that the primary key should be a nonclustered index. If the table already has a clustered index, then the primary key will be a nonclustered index by default (remember that there can only be one clustered index on a table).
Viewing Existing Nonclustered Indexes
We can use the sp_helpindex
stored procedure to get a list of indexes on a given table:
EXEC sp_helpindex 'dbo.Employees';
Result:
index_name index_description index_keys
------------------------------ ------------------------------------------------- -------------------
IX_Employees_Department nonclustered located on PRIMARY Department
IX_Employees_Name nonclustered located on PRIMARY LastName, FirstName
PK__Employee__7AD04FF11150B041 clustered, unique, primary key located on PRIMARY EmployeeID
We can see in the index_description
column whether it’s nonclustered or not.
Another way to do it is by querying the sys.indexes
view, which provides us with more control over the columns we want returned and filtering the result.
Querying with Nonclustered Indexes
Typically, whenever we run a query, SQL Server will determine whether or not to use a nonclustered index. We can check the query plan to see whether or not it chose the index.
We also have the option of providing SQL Server with a “hint” to use the index. That is, we can specify in our query that we’d like SQL Server to use the index for this query, even if SQL Server wouldn’t have normally chosen to use the index.
As alluded to, just because we create an index, it doesn’t mean that it will be used for all queries. It all depends on whether or not SQL Server decides that it would be more efficient to use the index.
But we can usually have a good idea of when SQL Server will use an index, based on the design of the index.
For example, it’s likely that SQL Server will use the above indexes for the following queries.
The following query can use the IX_Employees_Department
index:
SELECT FirstName, LastName
FROM Employees
WHERE Department = 'Sales';
This one can use the IX_Employees_Name
index:
SELECT EmployeeID, Department
FROM Employees
WHERE LastName = 'Smith' AND FirstName = 'John';
And this one can use the IX_Employees_Name
index for sorting:
SELECT EmployeeID, FirstName, LastName
FROM Employees
ORDER BY LastName, FirstName;
In these examples, SQL Server can use the nonclustered indexes to quickly locate the required data without scanning the entire table.
However, if those tables contain only a small amount of data, then SQL Server might find it more efficient to do bypass the index and do a table scan instead.
Maintaining Nonclustered Indexes
While nonclustered indexes can significantly improve query performance, they do come with some maintenance overhead:
- Storage space: Each index requires additional storage space.
- Insert, Update, Delete operations: These operations become slightly slower as the indexes need to be updated along with the table data.
- Index fragmentation: Over time, indexes can become fragmented, requiring periodic maintenance (rebuilding or reorganizing).
To maintain our indexes, we can use the following SQL Server commands.
We can check the sys.dm_db_index_physical_stats
system dynamic management view for the level of fragmentation:
SELECT *
FROM sys.dm_db_index_physical_stats
(DB_ID(), OBJECT_ID('Employees'), NULL, NULL, NULL);
Replace Employees
with the name of your table.
We can rebuild an index using the REBUILD
option of the ALTER INDEX
statement:
ALTER INDEX IX_Employees_Department ON Employees REBUILD;
Rebuilding an index actually drops and re-creates the index. This removes fragmentation, reclaims disk space by compacting the pages based on the specified or existing fill factor setting, and reorders the index rows in contiguous pages. We can also run this command to enable a disabled index.
If we don’t want to do a full rebuild, we can do a reorganization of the index:
ALTER INDEX IX_Employees_Name ON Employees REORGANIZE;
This is a more lightweight version that defragments the leaf level of clustered and nonclustered indexes by physically reordering the leaf-level pages to match the logical, left to right, order of the leaf nodes. It also compacts the index pages, based on the existing fill factor value.
These options will sort out any fragmentation that’s occurred with an index over time.
More on Indexes
See my article Introduction to Indexing in SQL for a basic overview of indexes in SQL databases.