Indexed views can significantly enhance the performance of complex queries in SQL Server. However, they come with several prerequisites and considerations that need to be carefully understood before implementation.
In this article, I provide an overview of indexed views, discussing their utility, prerequisites, and best practices, along with performance considerations and a simple example.
Introduction to Indexed Views
In SQL Server, a view is essentially a virtual table that is derived from one or more base tables. It allows users to query results from complex queries as if they were querying a single table.
While standard user-defined views do not store data themselves (the data is pulled from the underlying tables at runtime), an indexed view is different. An indexed view physically stores the result set of the view’s query in the database, similar to how a table with a clustered index stores data.
The first time we create an indexed view, the index must be a unique clustered index. This materializes the data in the view and stores it on disk. This transformation from a virtual to a physical structure allows for significant performance improvements, especially for queries that are frequently executed with complex aggregations or joins.
Once we’ve created the unique clustered index, we can go ahead and add nonclustered indexes if required.
One cool thing about indexed views is that, while we can query the indexed view directly, this isn’t actually a requirement. The query optimizer may choose to use the indexed view even if we don’t reference it in our query. In other words, even if we only query the underlying tables directly, we can benefit from the indexed view.
Benefits of Indexed Views
Here are some of the main benefits that indexed views can offer:
- Performance Improvement: Complex queries that involve joins, aggregations, or both can be optimized with indexed views, as they reduce the need for recalculating results every time a query is executed.
- Precomputed Results: They allow for precomputing and storing results, which can then be quickly retrieved, saving significant CPU and I/O resources.
- Query Simplification: By abstracting complex logic into a view, you can simplify your query structure, making it easier to write and maintain code. Mind you, this is true of any view – indexed or not.
While the potential benefits of implementing indexed views may sound compelling, there are a few other factors that we need to take into account when considering the use of index views, as outlined in this article.
Prerequisites for Creating Indexed Views
Before we go ahead and start creating indexed views, there are some prerequisites that need to be in place first. We need to have our SET
options set correctly for starters, and even then, not all views can become indexed views. To become indexed, a view must have certain things present, but other things not present.
Let’s explore what we need to do before creating an indexed view.
SET
Options
There’s a bunch of SET
options that must be configured correctly when creating or modifying an indexed view. These options ensure that the session settings under which the view is created are consistent with those required for indexed views.
The required SET
options are as follows:
SET ANSI_NULLS ON;
SET ANSI_PADDING ON;
SET ANSI_WARNINGS ON;
SET ARITHABORT ON;
SET CONCAT_NULL_YIELDS_NULL ON;
SET QUOTED_IDENTIFIER ON;
SET NUMERIC_ROUNDABORT OFF;
These settings must be enabled before any of the following occurs:
- The view and subsequent indexes on the view are created.
- The base tables referenced in the view at the time the view is created.
- When any insert, update, or delete operation is performed on any table that participates in the indexed view. This requirement includes operations such as bulk copy, replication, and distributed queries.
- The indexed view is used by the query optimizer to produce the query plan.
We can also use a (miinor) shortcut to the above code:
SET NUMERIC_ROUNDABORT OFF;
SET ANSI_PADDING,
ANSI_WARNINGS,
CONCAT_NULL_YIELDS_NULL,
ARITHABORT,
QUOTED_IDENTIFIER,
ANSI_NULLS ON;
This saves us from repeating SET
and ON
for all options that need to be set to ON
. Instead, we simply list all those options, sandwiched between SET
and ON
.
Schema Binding
One of the primary prerequisites for creating an indexed view is that the view must be schema-bound. Schema binding ensures that the underlying tables or views cannot be altered in a way that would affect the indexed view. This means that any changes to the structure of the base tables, such as adding or dropping columns, require dropping the indexed view first.
To create a schema-bound view, you must use the WITH SCHEMABINDING
option in the CREATE VIEW
statement.
CREATE VIEW ViewName
WITH SCHEMABINDING
AS
SELECT ...
Also, schema-bound views must use two-part names (schema.object
) when referencing the base tables or user-defined functions. For example dbo.Products
.
Deterministic Functions
For a view to be indexed, its definition must be deterministic. Deterministic expressions and functions always return the same result when given the same input. Non-deterministic functions, such as GETDATE()
or NEWID()
, cannot be used in indexed views because their results can change with each execution.
All expressions in the view must be deterministic for the view to be eligible for indexing. This includes all expressions in the SELECT
list, and the WHERE
and GROUP BY
clauses. This applies not only to the functions used but also any computed columns or derived expressions.
Data Types
Not all data types are supported in indexed views. Indexed views don’t support the following data types:
text
ntext
image
- FILESTREAM
xml
float
in the clustered key (this type is supported, but it can’t be included in the clustered key)- Sparse column sets.
The text
, ntext
, and image
types can be migrated to varchar(max)
, nvarchar(max)
, and varbinary(max)
respectively. The other types don’t have an equivalent. The float
type is allowed in indexed views, but only if it’s not in the clustered key.
Computed Columns
Computed columns that use non-deterministic functions or expressions are not permitted.
If the view includes computed columns, the expressions or functions used in those computed columns must inherently be deterministic. They must also adhere to the data type restrictions.
GROUP BY
Clause
If the view definition contains the GROUP BY
clause, it must also have COUNT_BIG(*)
. And it must be the asterisk wildcard (*
); not a column name. There can’t be any HAVING
clause either.
You can see an example of this in my example below, where I include COUNT_BIG(*)
in the SELECT
list, due to using the GROUP_BY()
clause.
These restrictions only apply to the view definition; they don’t apply to any queries that use an indexed view in its execution plan.
Also, if the view contains a GROUP BY
clause, the key of the unique clustered index can reference only columns specified in the GROUP BY
clause.
Oh, and in case you’re wondering, we can’t use COUNT()
in indexed views; only COUNT_BIG()
.
Other Requirements
Here are some other requirements for indexed views:
- Indexed views can only reference base tables in the same database.
- Indexed views cannot reference other views.
- The index can only be created by the view owner.
- CLR functions have specific requirements (see Microsoft link below).
- The
SELECT
statement in the view definition must not contain certain T-SQL syntax (see the Microsoft link below).
There’s also a bunch of other requirements/restrictions listed on the Microsoft website. Take a look through that list at some stage to see if anything stands out for your particular scenario. Look for the table that suggests alternatives for some of the unsupported syntax. For example, indexed views don’t support SELECT DISTINCT
, but you can get the same effect with GROUP BY
(along with COUNT_BIG(*)
as mentioned above).
In the meantime, feel free to continue on and create the following (simple) indexed view, which satisfies all requirements.
Create an Indexed View: A Step-by-Step Example
Let’s go through an example of creating an indexed view step by step.
Step 1: Configure Our SET
Options
As outlined above, our SET
options must be configured a certain way before we create an indexed view.
Here’s how we can do that:
SET NUMERIC_ROUNDABORT OFF;
SET ANSI_PADDING,
ANSI_WARNINGS,
CONCAT_NULL_YIELDS_NULL,
ARITHABORT,
QUOTED_IDENTIFIER,
ANSI_NULLS ON;
If you want to check your current settings, here’s how to check your SET
options.
Step 2: Create the Base Tables
Now that we’ve got our SET
options configured, let’s create the base tables:
CREATE TABLE Sales (
SalesID INT PRIMARY KEY,
ProductID INT NOT NULL,
SaleDate DATETIME NOT NULL,
Quantity INT NOT NULL,
Price MONEY NOT NULL
);
CREATE TABLE Products (
ProductID INT PRIMARY KEY,
ProductName NVARCHAR(100) NOT NULL,
CategoryID INT NOT NULL
);
Step 3: Create the View with Schema Binding
As mentioned, our view must be schema-bound before we try to add an index to it. We can ensure this is the case by including WITH SCHEMABINDING
when creating it:
CREATE VIEW SalesSummary
WITH SCHEMABINDING
AS
SELECT
P.ProductID,
P.ProductName,
SUM(S.Quantity) AS TotalQuantity,
SUM(S.Quantity * S.Price) AS TotalSales,
COUNT_BIG(*) AS [RowCount]
FROM
dbo.Sales S
JOIN
dbo.Products P
ON
S.ProductID = P.ProductID
GROUP BY
P.ProductName, P.ProductID;
Step 4: Create a Unique Clustered Index on the View
Now comes the time to create the actual index.
To create an indexed view, we must first create a unique clustered index on the view. The index ensures the materialization of the view’s result set:
CREATE UNIQUE CLUSTERED INDEX IDX_SalesSummary_ProductID ON SalesSummary (ProductID);
That creates a unique clustered index on the ProductID
column. It must be a column that only accepts unique values. In this case it’s the primary key column for the underlying table, and so it’s guaranteed to have unique values.
Once the index is created, the view is persisted, and its data is stored physically on disk, allowing for faster query performance.
Step 5: Create a Nonclustered Index on the View
Now that we’ve created a unique clustered index on the view, we can go ahead and add nonclustered indexes:
CREATE NONCLUSTERED INDEX IDX_SalesSummary_ProductName ON SalesSummary (ProductName);
That’s a nonclustered index on the ProductName
column.
Now that we’ve created our indexed view, let’s go ahead and insert some dummy data (optional) and then query the view.
Insert Dummy Data (optional)
I used the following scripts to insert a thousand rows into each table. You can easily adjust the number of rows inserted by changing the value in the TOP()
clause. You might want to insert more rows in the Sales
table compared to the Products
table to make it more realistic.
This step is optional for the purposes of this article, as we’re only going to be looking at the query plan. It’s still possible to check the query plan without any data, but having data will provide a more realistic indicator (such as how many rows were read).
Anyway, here are the scripts:
-- Insert products into the Products table
INSERT INTO Products (ProductID, ProductName, CategoryID)
SELECT TOP (1000)
ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS ProductID,
CONCAT('Product-', ROW_NUMBER() OVER (ORDER BY (SELECT NULL))) AS ProductName,
ABS(CHECKSUM(NEWID())) % 10 + 1 AS CategoryID
FROM sys.columns AS sc1
CROSS JOIN sys.columns AS sc2;
-- Insert sales records into the Sales table
INSERT INTO Sales (SalesID, ProductID, SaleDate, Quantity, Price)
SELECT TOP (1000)
ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS SalesID,
ABS(CHECKSUM(NEWID())) % 10000 + 1 AS ProductID, -- Random product IDs from 1 to 10,000
DATEADD(DAY, ABS(CHECKSUM(NEWID())) % 365, '2023-01-01') AS SaleDate, -- Random dates in 2023
ABS(CHECKSUM(NEWID())) % 10 + 1 AS Quantity, -- Random quantities between 1 and 10
ABS(CHECKSUM(NEWID())) % 500 + 100 AS Price -- Random prices between 100 and 600
FROM sys.columns AS sc1
CROSS JOIN sys.columns AS sc2;
Querying the Indexed Views
We can now query the indexed views as we would any other table or view, benefiting from the performance improvements provided by the precomputed and stored data.
Let’s include the WITH (NO EXPAND)
option to force the query optimizer to use the index (more on this later).
Using the Clustered Index
Here’s a query that uses the clustered index.
SELECT * FROM SalesSummary WITH (NOEXPAND);
Here’s what the estimated query plan looks like for the above query:
Here, it used a clustered index scan.
While an index scan is often better than a full table scan, it still requires a full scan of the index, retrieving all rows. That said, our query asked for all rows, so what did we expect?
Let’s add a WHERE
clause so that our query becomes more selective:
SELECT * FROM SalesSummary WITH (NOEXPAND) WHERE ProductId = 570;
And here’s the query plan output:
This time we got a clustered index seek, which is great! We can see that only one row would be read, compared to 96 rows with the previous query. This is based on my small amount of data (1000 rows in each table). This effect could be much more pronounced on tables with a lot of data.
And if we remove WITH (NOEXPAND)
, here’s what it looks like:
SELECT * FROM SalesSummary;
Query plan output:
This time we can see that the view’s index wasn’t used at all. In fact, this is the exact query plan I get when I run the view’s contents directly (i.e. I copy and paste the SELECT
statement from the view and run that).
So we can see that, even when we have an indexed view, it doesn’t mean that the index will always be used. And even if it is used, whether or not it uses a scan or a seek will depend on the query.
Using the Nonclustered Index
Here’s a query that uses the nonclustered index.
SELECT * FROM SalesSummary WITH (NOEXPAND);
Here’s what the query plan looks like for the above query:
We get an index scan, due to the non-selective nature of the query (it’s asking for all rows).
This time it scans the nonclustered index. That’s because we’re only asking for the ProductName
column, and that’s what the nonclustered index has indexed.
Let’s modify the query so that it’s more selective:
SELECT COUNT(ProductName)
FROM SalesSummary WITH (NOEXPAND)
WHERE ProductName LIKE 'Beer%';
Now let’s look at the query plan:
This time we get an index seek.
But things can change very quickly, even if we modify the query ever so slightly:
SELECT COUNT(ProductName)
FROM SalesSummary WITH (NOEXPAND)
WHERE ProductName LIKE '%Beer%';
Query plan:
This time we get an index scan instead of a seek. And all we did was add a wildcard to the front of Beer
, so that it became %Beer%
.
When we used Beer%
, the query was sargable. But when we used %Beer%
, the query wasn’t sargable.
Here’s how it goes:
- When we use
Beer%
, SQL Server can jump straight to the letter B, search through everything that starts with B, and then return the results. - When we use
%Beer%
on the other hand, SQL Server can’t do that. After all, what if there’s a product called say,Wonky Beer
? That wouldn’t turn up if SQL Server only looked for products that start with B. So instead, SQL Server has to search through every row, just in case there’s a product with a name that doesn’t start with Beer.
It’s also worth mentioning that, if the query needs columns that are not in the nonclustered index, then the database engine will need to do a key lookup to get the remaining columns. For example, if we added the TotalQuantity
column to our SELECT
list, then it would need to do a key lookup, because that column isn’t in our nonclustered index (only the ProductName
column is in our nonclustered index). This is not the most desirable situation, ideally the index would contain all the columns the query optimizer needs to return the result. In that case, our query should be faster than one that has to do a key lookup.
So, to get the most out of indexed views (and indexes in general), be sure to include all columns that the query requires in the index.
About the WITH (NOEXPAND)
Option
Typically, the query optimizer will only use the indexed view if it determines that it’s beneficial to do so. But we can use the WITH (NOEXPAND)
table hint to essentially force the query optimizer to use the indexed view.
WITH (NOEXPAND)
specifies that any indexed views aren’t expanded to access underlying tables when the query optimizer processes the query. The query optimizer treats the view like a table with clustered index.
I used this table hint in the examples above. If I hadn’t done that, the query optimizer might have chosen not to use the indexed view, and instead accessed the underlying tables as if there wasn’t an index on the view. Actually, I did try it without NOEXPAND
and that’s exactly what it did (it ignored the index).
Here’s one of those queries again:
SELECT COUNT(ProductName)
FROM SalesSummary WITH (NOEXPAND)
WHERE ProductName LIKE 'Beer%';
We can see the WITH (NOEXPAND)
part included in the FROM
clause.
We can also explicitly specify an index to use in order to force it to use a particular index:
SELECT COUNT(ProductName)
FROM SalesSummary WITH (NOEXPAND INDEX (IDX_SalesSummary_ProductName))
WHERE ProductName LIKE 'Beer%';
In this case, we use NOEXPAND INDEX
followed by the index name in parentheses.
Some Considerations of NOEXPAND
Here are a few points to be aware of:
NOEXPAND
only applies to indexed views.NOEXPAND
can only be used if the view is also named in the query.- The query optimizer considers using indexed views, even if they aren’t referenced directly in the query.
- The database engine will only automatically create statistics on an indexed view when
NOEXPAND
is used. Omitting this hint can lead to execution plan warnings about missing statistics that can’t be resolved by creating statistics manually.
While the NOEXPAND
option may seem like a no brainer, Microsoft recommends that we limit our use of hints like EXPAND
and NOEXPAND
to specific cases where testing has shown that they improve performance significantly.
Prior to SQL Server 2016 (13.x) Service Pack 1, automatic use of an indexed view by the query optimizer is supported only in specific editions of SQL Server. However, all later editions support automatic use of an indexed view. So if you’re on an earlier edition, you may need to use NOEXPAND
if you want any hope of an indexed view being used.
When Does the Query Optimizer Use an Indexed View?
As mentioned, the query optimizer will only use the indexed view if it determines that it’s beneficial to do so.
In particular, the query optimizer uses an indexed view when the following conditions are met:
- The query optimizer finds a match between the view index columns and elements in the query, such as the following:
- Search condition predicates in the
WHERE
clause - Join operations
- Aggregate functions
GROUP BY
clauses- Table references
- Search condition predicates in the
- The estimated cost for using the index has the lowest cost of any access mechanisms considered by the query optimizer.
- Every table referenced in the query (either directly, or by expanding a view to access its underlying tables) that corresponds to a table reference in the indexed view must have the same set of hints applied on it in the query.
- The
SET
options need to be configured as specified earlier in this article.
See the query processing architecture guide on Microsoft’s website for more.
Considerations for datetime
and smalldatetime
Data Types
When working with indexed views, expressions that involve implicit conversion of character strings to datetime
or smalldatetime
are considered nondeterministic. If the indexed view refers to datetime
or smalldatetime
string literals, you should explicitly convert the literal to the date type you want by using CAST()
or CONVERT()
with a deterministic date format style.
The nondeterministic styles are:
- All styles below 100 (except for 20 and 21)
- 106
- 107
- 109
- 113
- 130
So avoid those styles when converting.
See Microsoft’s article Nondeterministic conversion of literal date strings into DATE values for more information.
Performance Considerations
Query Performance
Indexed views can dramatically improve query performance by reducing the computational load on the server. Instead of recalculating the results of complex queries each time they are executed, the precomputed results stored in the indexed view can be quickly retrieved.
However, this performance benefit comes at the cost of additional storage and maintenance overhead.
Storage and Maintenance Costs
It pays to be aware of the impact that indexed views have on storage and maintenance within our databases. In particular:
- Storage: Indexed views consume additional disk space because they store the result set of the view’s query. The size of the indexed view depends on the complexity of the view and the volume of data in the underlying tables.
- Maintenance: Whenever data in the base tables is modified, SQL Server must update the indexed view to reflect those changes. This can lead to increased I/O and CPU usage, especially if the indexed view is based on large or frequently updated tables.
Index Maintenance and Updatability
Maintaining the clustered index on the view requires SQL Server to perform additional operations during insert, update, and delete actions on the base tables. This can introduce latency and reduce the overall performance of write operations.
You’ll need to weigh the benefits of faster read operations against the potential costs of slower write operations when deciding whether to implement an indexed view.
Other Considerations
Here are some other things you might want to consider when deciding whether or not to introduce indexed views into your database.
Indexed Views and Aggregations
Indexed views are particularly effective when used with aggregations. Aggregating data across large datasets can be time-consuming and resource-intensive, but by persisting the aggregated results, indexed views can significantly reduce the time required to execute queries that involve these aggregations.
Usage in OLTP vs. OLAP Environments
Whether to implement indexed views will often be influenced by the environment. For example:
- OLTP (Online Transaction Processing): In OLTP environments, where there are frequent inserts, updates, and deletes, the maintenance cost of indexed views can sometimes outweigh the performance benefits. But such environments can also have a lot of reads too, which will weigh in favor of indexed views. Careful consideration is needed before implementing indexed views in such environments.
- OLAP (Online Analytical Processing): In OLAP environments, where data is typically read more than it is written, and where queries are often large and complex, indexed views can provide substantial performance improvements.
Modifying the Base Tables
You’ll also need to be aware of the following before modifying any base tables referenced by the view:
- Restrictions on Modifications: Once a view is indexed, certain modifications to the underlying tables (e.g., altering columns, dropping tables) may require the indexed view to be dropped and recreated.
- Schema Changes: As the view is schema-bound, any changes to the schema of the base tables (like renaming or altering columns) necessitate the removal and re-creation of the view.
Conclusion
Indexed views in SQL Server provide a great option for optimizing complex queries, especially those involving heavy aggregations or joins. By storing precomputed results, they can significantly enhance query performance, but they also introduce additional storage and maintenance requirements.
They do require a bit more thought and planning before setting them up (such as schema binding, deterministic functions, specific SET
options, etc), but the performance benefits could make this very worthwhile. That said, there are cases where they can negatively affect performance (such as when lots of indexed views, or complex indexed views, refer to tables with lots of DML activity).
If you haven’t experimented with indexed views yet, maybe now’s the time to do so.