What is a Superkey Key in SQL?

There are various key types in SQL, each with their own characteristics. Some of these can be enforced with a constraint of one kind or another. Others are not necessarily enforced with a constraint, but are more of an abstract notion, albeit, an important one. These key types consist of one or more columns with certain characteristics that qualify it as the particular key.

One such key type is the superkey.

A superkey in SQL is defined as a set of one or more columns (attributes) that can uniquely identify a row within a table. In simpler terms, a superkey is any combination of columns where no two rows in a table have the same values for those columns.

Characteristics of a Superkey

Here are the basic characteristics of superkeys:

  • Uniqueness: The primary characteristic of a superkey is that it must uniquely identify each row in the table. Even if a superkey consists of multiple columns, the combined values across these columns must be unique for each row.
  • Redundancy: A superkey can contain more columns than necessary to ensure uniqueness. For example, if a table has a column ID that is unique, then ID alone is a superkey. However, a combination of ID and another column, like Name, also forms a superkey, albeit with redundancy.
  • Non-minimality: Unlike a candidate key, which is a minimal superkey, a superkey can have additional columns that aren’t necessary for uniqueness. This non-minimality is what differentiates a superkey from other key types.

Example of a Superkey in SQL

Consider a table named Employees with the following columns:

  • EmployeeID (unique identifier)
  • Name
  • Email
  • PhoneNumber

Here are examples of superkeys in this table:

  1. Single Column Superkey: EmployeeID alone is a superkey because it uniquely identifies each employee.
  2. Multiple Columns Superkey: A combination of Email and PhoneNumber can also be a superkey if no two employees share the same email and phone number.
  3. Redundant Superkey: A combination of EmployeeID, Name, and Email is still a superkey, but it’s redundant because EmployeeID alone is sufficient to ensure uniqueness.

Superkey vs. Candidate Key

While every candidate key is a superkey, not every superkey is a candidate key. A candidate key is a superkey that is minimal, meaning it has no unnecessary columns. For instance, in the Employees table, EmployeeID is both a superkey and a candidate key. However, a combination of EmployeeID and Name is a superkey but not a candidate key, as Name is not required to maintain uniqueness.