What is a Composite Index in SQL?

When looking at ways to fix slow running SQL queries, one of the first things we might do is look at what indexes are available to support the query. If none are available, then we might look at creating one. And if there is an index, perhaps it doesn’t cater for the query as well as it could.

Indexes are often created on a single column, but this isn’t the only way to create an index. We can also create indexes on multiple columns. These are typically known as composite indexes.

Understanding Composite Indexes

A composite index, also known as a multi-column index or a concatenated index, is an index that includes multiple columns from a table. Unlike a single-column index, which sorts data based on one column, a composite index creates a sorted structure using two or more columns.

When we create a composite index, the order of the columns matters. The leftmost column is the primary sort key, followed by the second column, and so on. This ordering affects how the index can be used in our queries.

The main advantage of composite indexes is their ability to speed up queries that involve multiple columns in the WHERE clause or JOIN conditions. By using a composite index, we can potentially reduce the number of disk I/O operations and improve query execution time for those queries.

Creating a Composite Index

We can create a composite index in the same way that we’d create a single-column index. The difference is that we specify multiple columns.

For example, one option is to create the composite index within our CREATE TABLE statement:

CREATE TABLE customers (
    customer_id INT NOT NULL,
    last_name   CHAR(30) NOT NULL,
    first_name  CHAR(30) NOT NULL,
    PRIMARY KEY (customer_id),
    INDEX idx_customer_name(last_name,first_name)
);

This example creates a table called customers and defines an index called idx_customer_name on the last_name and the first_name columns.

The ordering of column names is important here. Composite indexes work from the left.

So the above composite index could be useful for the following queries:

SELECT * FROM customers 
WHERE last_name='Simpson';

SELECT * FROM customers 
WHERE last_name='Simpson' AND first_name='Homer';

SELECT * FROM customers
WHERE last_name='Simpson'
AND (first_name='Homer' OR first_name='Bart');

SELECT * FROM customers
WHERE last_name='Simpson'
AND first_name >='B' AND first_name < 'J';

But it wouldn’t be used if we searched for just the first name:

SELECT * FROM customers 
WHERE first_name='Homer';

That’s because the leftmost column of the index isn’t included in the query’s search predicate. Generally, composite indexes are only used if the leftmost columns are included in the search predicate.

Another way to create a composite index is with the CREATE INDEX statement.

Let’s create a three-column index:

CREATE INDEX idx_full_name ON customers(first_name, middle_name, last_name);

The index can be used when searching the following:

SELECT * FROM customers 
WHERE first_name='Homer';

SELECT * FROM customers 
WHERE first_name='Homer' AND middle_name='Jay';

SELECT * FROM customers 
WHERE first_name='Homer' AND middle_name='Jay' AND last_name='Simpson';

But it can’t be used on the following:

SELECT * FROM customers 
WHERE middle_name='Jay';

SELECT * FROM customers 
WHERE middle_name='Jay' AND last_name='Simpson';

That’s because those columns aren’t leftmost in the index.

There can be cases where the “leftmost” rule doesn’t apply. For example, some DBMSs can use a “skip scan” method that can allow scanning an index on keys that are not the leftmost. As always, it’s best to check your DBMS’s documentation to see exactly how indexing is implemented before deciding which approach to use.

Any Downsides?

It’s important to note that composite indexes aren’t always the best solution. As with any index, they increase the storage requirements and can slow down INSERT, UPDATE, and DELETE operations. Therefore, we should carefully consider our options before implementing composite indexes.

That said, if you’re battling with a slow running query, and you know that your users are often using multiple columns in their search criteria, then a composite index could be the way to go.

Composite Primary Keys

Some DBMSs automatically create an index on primary key columns. And if not, then we have the option to do so. A primary key can also consist of multiple columns, making it a composite primary key. See my article What is a Composite Primary Key? for an example.