When a foreign key is defined in a table, it ensures that the values in the column(s) correspond to values in a primary key or unique key in another table. While PostgreSQL automatically creates an index for primary keys (because these need to be fast for lookups and enforcing uniqueness), it does not automatically create an index for foreign keys.
Why Create Indexes on Foreign Keys in PostgreSQL?
Here’s why it’s usually recommended to create indexes on foreign keys:
- Optimizing Joins:
Foreign keys are frequently used inJOIN
operations when querying related data. Without an index on the foreign key column, PostgreSQL must perform a sequential scan on the table, leading to slower joins as the table size grows. - Improving Referential Integrity Checks:
When rows in the referenced table (i.e., the table with the primary key) are updated or deleted, PostgreSQL must ensure that referential integrity is maintained by checking the foreign key constraints. Without an index, PostgreSQL must perform a full scan on the table containing the foreign key, which can be slow for large tables. - Performance on Updates/Deletes:
If you performON DELETE
orON UPDATE
operations on the primary key, PostgreSQL will need to verify that no dependent records in the foreign key table are violating the constraint. Indexes speed up these checks.
Why Aren’t Indexes Automatically Created on Foreign Keys?
It might be tempting to think that it’s ridiculous that indexes aren’t automatically created on foreign keys. But here are some reasons why you might not want indexes to be automatically created on foreign keys:
- Flexibility:
Automatically indexing all foreign keys might lead to unnecessary overhead if certain foreign keys are rarely used in queries, resulting in additional maintenance costs duringINSERT
,UPDATE
, andDELETE
operations. - Developer Choice:
PostgreSQL allows developers to choose which foreign key relationships require indexing (and which type of index to use) based on the workload. This way, developers can create indexes where they know performance will benefit from them. - Indexing Overhead:
Maintaining indexes has a cost in terms of storage and write performance. By not creating indexes automatically on foreign keys, PostgreSQL avoids introducing unnecessary overhead on tables where foreign key lookups are infrequent or irrelevant.
Here’s what the PostgreSQL documentation has to say about this:
Since a
DELETE
of a row from the referenced table or anUPDATE
of a referenced column will require a scan of the referencing table for rows matching the old value, it is often a good idea to index the referencing columns too. Because this is not always needed, and there are many choices available on how to index, the declaration of a foreign key constraint does not automatically create an index on the referencing columns.
Example: Creating an Explicit Index on a Foreign Key
Let’s walk through an example where we explicitly create an index for a foreign key but leave the primary key without an explicit index (relying on PostgreSQL to automatically index the primary key).
Step 1: Create Tables and Define Foreign Key
CREATE TABLE customers (
customer_id SERIAL PRIMARY KEY,
customer_name VARCHAR(100)
);
CREATE TABLE orders (
order_id SERIAL PRIMARY KEY,
customer_id INT REFERENCES customers(customer_id),
order_date DATE
);
At this point, PostgreSQL has automatically created an index on the customer_id
column in the customers
table because it is the primary key. However, no index has been created for customer_id
in the orders
table, which is a foreign key.
Step 2: Check Indexes
We can check which columns have indexes by querying the pg_indexes
view. So let’s check that now (before we create an index for the foreign key column):
SELECT indexname, indexdef
FROM pg_indexes
WHERE tablename IN ('customers', 'orders');
Output:
indexname | indexdef
----------------+----------------------------------------------------------------------------------
customers_pkey | CREATE UNIQUE INDEX customers_pkey ON public.customers USING btree (customer_id)
orders_pkey | CREATE UNIQUE INDEX orders_pkey ON public.orders USING btree (order_id)
We can see that there are only two indexes; both of which are for the primary keys of their respective tables.
No indexes have been created for the foreign key.
Step 3: Create Index on the Foreign Key
Let’s now explicitly create an index on the foreign key (customer_id
) column in the orders
table:
CREATE INDEX idx_orders_customer_id ON orders(customer_id);
Step 4: Re-Check Indexes
Let’s re-run the pg_indexes
view:
SELECT indexname, indexdef
FROM pg_indexes
WHERE tablename IN ('customers', 'orders');
Output:
indexname | indexdef
------------------------+----------------------------------------------------------------------------------
customers_pkey | CREATE UNIQUE INDEX customers_pkey ON public.customers USING btree (customer_id)
orders_pkey | CREATE UNIQUE INDEX orders_pkey ON public.orders USING btree (order_id)
idx_orders_customer_id | CREATE INDEX idx_orders_customer_id ON public.orders USING btree (customer_id)
We can see that we now have an index on the foreign key column.
Summary
Creating an index on a foreign key in PostgreSQL is recommended for improving performance in queries that involve foreign key relationships, such as JOIN
s, and for maintaining referential integrity efficiently. However, foreign key indexes are not created automatically because not all foreign keys may need indexing, and the overhead of maintaining unnecessary indexes can be costly.
By explicitly creating indexes where needed, we ensure that our database performs well while still maintaining flexibility in how indexes are applied.