How to Find the Optimal Unique Identifier in a Table in SQL Server: sp_special_columns

In SQL Server, you can use the sp_special_columns system stored procedure to identify a unique identifier for the table. Specifically, it returns the optimal set of columns that uniquely identify a row in the table. It also returns columns automatically updated when any value in the row is updated by a transaction.

sp_special_columns is equivalent to SQLSpecialColumns in ODBC.

If there are no columns that can uniquely identify the table, the result set is empty.

Syntax

The syntax goes like this:

sp_special_columns [ @table_name = ] 'table_name'     
     [ , [ @table_owner = ] 'table_owner' ]   
     [ , [ @qualifier = ] 'qualifier' ]   
     [ , [ @col_type = ] 'col_type' ]   
     [ , [ @scope = ] 'scope' ]  
     [ , [ @nullable = ] 'nullable' ]   
     [ , [ @ODBCVer = ] 'ODBCVer' ]   
[ ; ]

The @table_name argument is required. The others are optional. See the Microsoft documentation for a detailed explanation of each argument.

Example 1 – Primary Key Column

Here’s a basic example against a table with a primary key column called PersonId:

EXEC sp_special_columns Person;

It can also be run like this:

EXEC sp_special_columns @table_name = 'Person';

Result:

+---------+---------------+-------------+-------------+-------------+----------+---------+-----------------+
| SCOPE   | COLUMN_NAME   | DATA_TYPE   | TYPE_NAME   | PRECISION   | LENGTH   | SCALE   | PSEUDO_COLUMN   |
|---------+---------------+-------------+-------------+-------------+----------+---------+-----------------|
| 1       | PersonId      | 4           | int         | 10          | 4        | 0       | 1               |
+---------+---------------+-------------+-------------+-------------+----------+---------+-----------------+

In this case, the primary key column is returned. I know this is the primary key column, because I created the table with the following code:

CREATE TABLE Person (
  PersonId int primary key, 
  PersonName varchar(500)
  );

So it seems that the stored procedure did in fact return the optimal column that uniquely identifies this table.

Example 2 – UNIQUE Column

The table in this example doesn’t have a primary key, but it does have a UNIQUE constraint.

Here’s the code used to create the table:

CREATE TABLE Event (
  EventId int UNIQUE, 
  EventName varchar(500)
  );

So let’s now execute sp_special_columns against that table:

EXEC sp_special_columns Event;

Result:

+---------+---------------+-------------+-------------+-------------+----------+---------+-----------------+
| SCOPE   | COLUMN_NAME   | DATA_TYPE   | TYPE_NAME   | PRECISION   | LENGTH   | SCALE   | PSEUDO_COLUMN   |
|---------+---------------+-------------+-------------+-------------+----------+---------+-----------------|
| 1       | EventId       | 4           | int         | 10          | 4        | 0       | 1               |
+---------+---------------+-------------+-------------+-------------+----------+---------+-----------------+

In this case, the column with the UNIQUE constraint is deemed to be the optimal unique identifier.

However, this doesn’t necessarily mean that any column constrained by a UNIQUE constraint will automatically qualify as a unique identifier. The result can depend on how null values are treated.

Example 3 – The @nullable Argument

You can use the @nullable argument to specify whether the special columns can accept a null value.

Here, I run the same code again, except this time I use @nullable = 'O'.

EXEC sp_special_columns 
  Event, 
  @nullable = 'O';

Result:

(0 rows affected)

Here it is using @nullable = 'U'

EXEC sp_special_columns 
  Event, 
  @nullable = 'U';

Result:

+---------+---------------+-------------+-------------+-------------+----------+---------+-----------------+
| SCOPE   | COLUMN_NAME   | DATA_TYPE   | TYPE_NAME   | PRECISION   | LENGTH   | SCALE   | PSEUDO_COLUMN   |
|---------+---------------+-------------+-------------+-------------+----------+---------+-----------------|
| 1       | EventId       | 4           | int         | 10          | 4        | 0       | 1               |
+---------+---------------+-------------+-------------+-------------+----------+---------+-----------------+

O specifies special columns that do not allow null values. U specifies columns that are partially nullable. U is the default value.

Here’s what happens if I create the column as NOT NULL:

DROP TABLE Event;

CREATE TABLE Event (
  EventId int NOT NULL UNIQUE, 
  EventName varchar(500)
  );

EXEC sp_special_columns 
  Event, 
  @nullable = 'U';

EXEC sp_special_columns 
  Event, 
  @nullable = 'O';

Result:

+---------+---------------+-------------+-------------+-------------+----------+---------+-----------------+
| SCOPE   | COLUMN_NAME   | DATA_TYPE   | TYPE_NAME   | PRECISION   | LENGTH   | SCALE   | PSEUDO_COLUMN   |
|---------+---------------+-------------+-------------+-------------+----------+---------+-----------------|
| 1       | EventId       | 4           | int         | 10          | 4        | 0       | 1               |
+---------+---------------+-------------+-------------+-------------+----------+---------+-----------------+
(1 row affected)
+---------+---------------+-------------+-------------+-------------+----------+---------+-----------------+
| SCOPE   | COLUMN_NAME   | DATA_TYPE   | TYPE_NAME   | PRECISION   | LENGTH   | SCALE   | PSEUDO_COLUMN   |
|---------+---------------+-------------+-------------+-------------+----------+---------+-----------------|
| 1       | EventId       | 4           | int         | 10          | 4        | 0       | 1               |
+---------+---------------+-------------+-------------+-------------+----------+---------+-----------------+
(1 row affected)

This time both O and U produced the same result.

If you have a table with multiple UNIQUE constraint columns, and some allow null values while others don’t, this argument can have an impact on which one is deemed to be the optimal unique identifier. See Example 7 at the bottom of this article for an example of what I mean.

Example 4 – IDENTITY Column

The table in this example doesn’t have a primary key or a UNIQUE constraint, but it does have an IDENTITY column.

Here’s the code used to create the table:

CREATE TABLE Product (
  ProductId int IDENTITY, 
  ProductName varchar(500)
  );

So let’s now execute sp_special_columns against that table:

EXEC sp_special_columns Product;

Result:

(0 rows affected)

So it seems that IDENTITY is not enough to uniquely identity this table.

Example 5 – Multicolumn Primary Key

Here’s one with a multicolumn primary key. In this case two columns are used for the primary key.

Here’s the code used to create the table:

CREATE TABLE PersonProduct (
  PersonId int, 
  ProductId int,
   CONSTRAINT PK_PersonProduct PRIMARY KEY (PersonId, ProductId)
  ); 

So let’s now execute sp_special_columns against that table:

EXEC sp_special_columns PersonProduct;

Result:

+---------+---------------+-------------+-------------+-------------+----------+---------+-----------------+
| SCOPE   | COLUMN_NAME   | DATA_TYPE   | TYPE_NAME   | PRECISION   | LENGTH   | SCALE   | PSEUDO_COLUMN   |
|---------+---------------+-------------+-------------+-------------+----------+---------+-----------------|
| 1       | PersonId      | 4           | int         | 10          | 4        | 0       | 1               |
| 1       | ProductId     | 4           | int         | 10          | 4        | 0       | 1               |
+---------+---------------+-------------+-------------+-------------+----------+---------+-----------------+

Example 6 – Primary Key and UNIQUE Constraint

What if there’s a primary key and a UNIQUE constraint in the same table?

Let’s find out:

CREATE TABLE PersonEvent (
  PersonEventId int UNIQUE,
  PersonId int, 
  EventId int,
   CONSTRAINT PK_PersonEvent PRIMARY KEY (PersonId, EventId)
  );

Execute sp_special_columns against that table:

EXEC sp_special_columns PersonEvent;

Result:

+---------+---------------+-------------+-------------+-------------+----------+---------+-----------------+
| SCOPE   | COLUMN_NAME   | DATA_TYPE   | TYPE_NAME   | PRECISION   | LENGTH   | SCALE   | PSEUDO_COLUMN   |
|---------+---------------+-------------+-------------+-------------+----------+---------+-----------------|
| 1       | PersonId      | 4           | int         | 10          | 4        | 0       | 1               |
| 1       | EventId       | 4           | int         | 10          | 4        | 0       | 1               |
+---------+---------------+-------------+-------------+-------------+----------+---------+-----------------+

The primary key won.

What if we switch the primary key and the UNIQUE key columns around?

OK, let’s create another whole table just for that:

CREATE TABLE PersonEvent2 (
  PersonEventId int PRIMARY KEY,
  PersonId int UNIQUE, 
  EventId int UNIQUE
  ); 

Execute sp_special_columns against that table:

EXEC sp_special_columns PersonEvent2;

Result:

+---------+---------------+-------------+-------------+-------------+----------+---------+-----------------+
| SCOPE   | COLUMN_NAME   | DATA_TYPE   | TYPE_NAME   | PRECISION   | LENGTH   | SCALE   | PSEUDO_COLUMN   |
|---------+---------------+-------------+-------------+-------------+----------+---------+-----------------|
| 1       | PersonEventId | 4           | int         | 10          | 4        | 0       | 1               |
+---------+---------------+-------------+-------------+-------------+----------+---------+-----------------+

So the primary key won out again.

Example 7 – Lots of UNIQUE Constraints

What if every column has a UNIQUE constraint?

CREATE TABLE Event2 (
  EventId int UNIQUE, 
  EventName varchar(500) UNIQUE,
  StartDate date UNIQUE,
  EndDate date UNIQUE
  );

Execute sp_special_columns against that table:

EXEC sp_special_columns Event2;

Result:

+---------+---------------+-------------+-------------+-------------+----------+---------+-----------------+
| SCOPE   | COLUMN_NAME   | DATA_TYPE   | TYPE_NAME   | PRECISION   | LENGTH   | SCALE   | PSEUDO_COLUMN   |
|---------+---------------+-------------+-------------+-------------+----------+---------+-----------------|
| 1       | EndDate       | -9          | date        | 10          | 20       | NULL    | 1               |
+---------+---------------+-------------+-------------+-------------+----------+---------+-----------------+

But let’s see what happens if we set one of those columns to NOT NULL, then use @nullable = 'O':

DROP TABLE Event2;

CREATE TABLE Event2 (
  EventId int NOT NULL UNIQUE, 
  EventName varchar(500) UNIQUE,
  StartDate date UNIQUE,
  EndDate date UNIQUE
  );

Execute sp_special_columns with @nullable = 'O':

EXEC sp_special_columns 
  Event2,
  @nullable = 'O'; 

Result:

+---------+---------------+-------------+-------------+-------------+----------+---------+-----------------+
| SCOPE   | COLUMN_NAME   | DATA_TYPE   | TYPE_NAME   | PRECISION   | LENGTH   | SCALE   | PSEUDO_COLUMN   |
|---------+---------------+-------------+-------------+-------------+----------+---------+-----------------|
| 1       | EventId       | 4           | int         | 10          | 4        | 0       | 1               |
+---------+---------------+-------------+-------------+-------------+----------+---------+-----------------+

So the “not nullable” column is now chosen as the optimal unique identifier.

Now let’s execute sp_special_columns with @nullable = 'U':

EXEC sp_special_columns 
  Event2,
  @nullable = 'U';

Result:

+---------+---------------+-------------+-------------+-------------+----------+---------+-----------------+
| SCOPE   | COLUMN_NAME   | DATA_TYPE   | TYPE_NAME   | PRECISION   | LENGTH   | SCALE   | PSEUDO_COLUMN   |
|---------+---------------+-------------+-------------+-------------+----------+---------+-----------------|
| 1       | EndDate       | -9          | date        | 10          | 20       | NULL    | 1               |
+---------+---------------+-------------+-------------+-------------+----------+---------+-----------------+

It’s now back to the previous column.