How SQL Constraints Work: A Beginner’s Overview

Constraints are an important concept in relational database management systems (RDBMSs). Whenever we design a database, we need to ensure that it will be able to enforce data integrity, by checking or restricting what the user can enter to only data that conforms to the rules of the database. That’s where a constraint can help.

This article explores what SQL constraints are, the various types available, their importance, and how they are implemented.

What is a Constraint in SQL?

In SQL, a constraint is a rule that is applied to a column or a table to limit the types of data that can be inserted. Constraints help maintain the accuracy and integrity of the data by ensuring that the data entered into a table meets specific criteria. Constraints can be applied at the column level (affecting a single column) or at the table level (affecting multiple columns).

Types of SQL Constraints

There are many types of constraints, each one having their own purpose. Below are the main constraint types, along with a quick description and example of the SQL used to create each one.

  • NOT NULL Constraint
    • Ensures that a column cannot have a NULL value.
    • Example: CREATE TABLE Employees (ID int NOT NULL, Name varchar(255) NOT NULL);
  • UNIQUE Constraint
    • Ensures that all the values in a column are unique.
    • Example: CREATE TABLE Employees (ID int UNIQUE, Email varchar(255) UNIQUE);
  • PRIMARY KEY Constraint
    • Uniquely identifies each row in a table. It’s effectively a combination of NOT NULL and UNIQUE, but it’s explicitly defined as PRIMARY KEY. There can only be one primary key on a table.
    • Example: CREATE TABLE Employees (ID int PRIMARY KEY, Name varchar(255));
  • FOREIGN KEY Constraint
    • Ensures referential integrity by linking two tables together. The FOREIGN KEY in one table points to the PRIMARY KEY in another table.
    • Example: CREATE TABLE Orders (OrderID int, EmployeeID int, FOREIGN KEY (EmployeeID) REFERENCES Employees(ID));
  • CHECK Constraint
    • Ensures that all values in a column satisfy a specific condition as specified in its definition.
    • Example: CREATE TABLE Employees (ID int, Age int CHECK (Age >= 18));
  • DEFAULT Constraint
    • Sets a default value for a column if no value is specified.
    • Example: CREATE TABLE Employees (ID int, Country varchar(255) DEFAULT 'USA');
  • INDEX Constraint
    • Although not strictly a constraint, an index is often used to enforce UNIQUE and PRIMARY KEY constraints, speeding up query performance.
    • Example: CREATE UNIQUE INDEX UX_Email ON Employees(Email);

Importance of SQL Constraints

Constraints are vital for:

  • Data Integrity: Ensuring that the data stored in the database is accurate and consistent.
  • Data Validation: Preventing invalid data from being entered into the database.
  • Data Relationships: Maintaining referential integrity between tables, ensuring data remains synchronized across related tables.
  • Performance Optimization: Some constraints, such as those with an index, can improve the performance of queries.

How to Implement SQL Constraints

SQL constraints can be implemented during the creation of tables (with the CREATE TABLE statement) or they can be added to existing tables using the ALTER TABLE statement.

Adding Constraints During Table Creation

   CREATE TABLE Employees (
      ID int PRIMARY KEY,
      Name varchar(255) NOT NULL,
      Age int CHECK (Age >= 18),
      Email varchar(255) UNIQUE
   );

This table has a constraint on every column:

  • The ID column has a PRIMARY KEY constraint. This uniquely identifies the row.
  • The Name column has a NOT NULL constraint. This prevents NULL values from entering the column.
  • The Age column has a CHECK constraint. In this case we’re checking that the employee is at least 18 years old.
  • The Email column has a UNIQUE constraint. Therefore, no two employees can have the same email address.

Another way to do it is add the constraints after the column definitions:

CREATE TABLE Employees (
    ID int,
    Name varchar(255) NOT NULL,
    Age int,
    Email varchar(255),
    PRIMARY KEY (ID),
    CHECK (Age >= 18),
    UNIQUE (Email)
);

Note that NOT NULL constraints are an exception. In most (if not all) major RDBMSs, these are defined within the column’s definition and therefore we need to keep the NOT NULL part inside the column’s definition.

Constraints added after the column definitions are sometimes referred to as table constraints. One benefit of table constraints is that we can apply a single constraint across multiple tables. For example:

CREATE TABLE Employees (
    ID int,
    Name varchar(255) NOT NULL,
    Age int,
    Email varchar(255),
    PRIMARY KEY (ID),
    CHECK (Age >= 18 AND len(Email) > 4),
    UNIQUE (Email)
);

In this case I defined a table constraint like this: CHECK (Age >= 18 AND len(Email) > 4)

This constraint covers two columns; Age and Email

In practice, you would probably want to have a better check for the email format than the one listed here, but this is just a simple example for demonstration purposes. And you might prefer to keep constraints like these ones separate from each another.

Constraint Names

If we don’t provide a name for our constraints, the database engine will name them for us. In the above examples, we didn’t provide a name, and so the database engine will generate a name for them. Depending on the RDBMS, this will often be a name that contains a bunch of random characters. This can make things harder if we want to modify or drop an existing constraint, as we will need to find the name first. And if we drop a constraint and recreate it, it will probably be recreated with a different name.

For example, here’s what SQL Server named the above constraints for me:

column_name  constraint_name               
----------- ------------------------------
ID PK__Employee__3214EC272DDABBB0
Email UQ__Employee__A9D105346631E364
Age CK__Employees__5A1A5A11
Email CK__Employees__5A1A5A11

We can see that the first part of the constraint name is somewhat descriptive, but the last part appears to be just a bunch of jumbled up characters.

We’ll look at how I found out their names later, but for now, let’s see if we can make our constraint names a bit easier to read and remember.

Fortunately we have an alternative option; we can provide our own name for each constraint.

Here’s an example of creating the same constraints as above, except this time we provide our own names:

CREATE TABLE Employees (
   ID int CONSTRAINT PK_Employee PRIMARY KEY,
   Name varchar(255) NOT NULL,
   Age int CONSTRAINT CHK_Employee_Age CHECK (Age >= 18),
   Email varchar(255) CONSTRAINT UQ_Employee_Email UNIQUE (Email)
);

This was the example of including the constraints in the column definitions.

Here it is again, but with the constraints listed at the end:

CREATE TABLE Employees (
   ID int,
   Name varchar(255) NOT NULL,
   Age int,
   Email varchar(255),
   CONSTRAINT PK_Employee PRIMARY KEY (ID),
   CONSTRAINT CHK_Employee_Age CHECK (Age >= 18),
   CONSTRAINT UQ_Employee_Email UNIQUE (Email)
);

As mentioned, NOT NULL constraints are a bit of an exception, and so it remains in the column definition.

Adding Constraints to Existing Tables

We can add constraints to existing tables if required. We can do this with the ALTER TABLE statement.

For example:

ALTER TABLE Orders
ADD CONSTRAINT PK_Orders PRIMARY KEY (ID);

This example adds a PRIMARY KEY constraint called PK_Orders to the Orders table. The assumption is that we didn’t add the PRIMARY KEY when we created the table.

Here’s an example of creating a table without any constraints (except for the NOT NULL constraints), and then adding the constraints later:

-- Step 1: Create the table
CREATE TABLE Employees (
   ID int NOT NULL,
   Name varchar(255) NOT NULL,
   Age int,
   Email varchar(255),
   DepartmentID int
);

-- Step 2: Add constraints
ALTER TABLE Employees
ADD CONSTRAINT PK_Employee PRIMARY KEY (ID);

ALTER TABLE Employees
ADD CONSTRAINT UQ_Employee_Email UNIQUE (Email);

ALTER TABLE Employees
ADD CONSTRAINT CHK_Employee_Age CHECK (Age >= 18);

You’ll often see scripts like this; where the developer creates the table first (without any constraints), and then they immediately add all the constraints under the CREATE TABLE statement. So basically, creating the table, then creating all constraints for that table.

One thing you may notice is that I defined the primary key column as NOT NULL. If I hadn’t done that, I would’ve received an error when I tried to add the primary key constraint to it later (due to the column not having a NOT NULL constraint). If you’re trying to add a primary key constraint to a column that doesn’t have a NOT NULL constraint, you can always alter the existing column to be NOT NULL first:

ALTER TABLE Employees
ALTER COLUMN ID int NOT NULL;

Then the primary key constraint can be added once that’s done.

If the candidate key already contains NULL values, then you’ll need to eliminate them before adding the NOT NULL and primary key constraints.

Finding Constraint Names

Sometimes we need to make modifications to existing constraints, or perhaps we simply want to check to see what constraints are on a table, etc.

In such cases, we might like to see a list of constraints on a given table. Fortunately there are many ways of doing this. However, the available options do vary between RDBMS.

One standard way to do this is to use the information schema views. For example, we can use the information_schema.constraint_column_usage view to get a list of all constraint names on a given table:

SELECT
    column_name,
    constraint_name
FROM information_schema.constraint_column_usage
WHERE table_name = 'Employees';

Example output:

column_name  constraint_name  
----------- -----------------
ID PK_Employee
Email UQ_Employee_Email
Age CHK_Employee_Age

We can see our constraint names along with their respective columns.

Foreign Key Constraints

A foreign key constraint is used to define a relationship between two tables (or in some cases, a relationship within the same table). We normally create foreign keys when we want data in one table to match data in another for the purposes of creating a parent/child relationship. That is, we’ll have a parent table and a child table. The child table’s foreign key references the parent table’s primary key. This means that any data that goes into the child table’s foreign key column must already exist in the parent’s foreign key column.

Here’s an example:

CREATE TABLE Types( 
TypeId INTEGER PRIMARY KEY,
Type
);

CREATE TABLE Pets(
PetId INTEGER PRIMARY KEY,
PetName,
TypeId,
FOREIGN KEY(TypeId) REFERENCES Types(TypeId)
);

The part that creates the foreign key is this:

FOREIGN KEY(TypeId) REFERENCES Types(TypeId)

The FOREIGN KEY(TypeId) part declares Pets.TypeId as the foreign key.

Once these two tables have been created, whenever we insert a row into the Pets table, we must make sure that the value we specify for the TypeId column is a value that’s already in the TypeId of the Types table. If we specify a value that’s not already in the parent table, we’ll get an error. This is the whole purpose of a foreign key constraint – it prevents users from entering data that would create an orphaned record. It helps us to maintain referential integrity.

Best Practices for Using SQL Constraints

Here are some best practices we should keep in mind for constraints when designing our databases:

  • Always Define Primary Keys: In most cases, every table should have a primary key to uniquely identify records. While there can be exceptions to this rule, they are usually quite rare.
  • Use NOT NULL Where Necessary: Ensure critical fields do not accept NULL values to avoid data inconsistencies.
  • Regularly Update Constraints: Modify constraints as the database schema evolves to maintain data integrity.
  • Document Your Constraints: Keep detailed documentation of the constraints applied to each table for better database management.

Common Mistakes to Avoid

Here are some common mistakes that we should try to avoid:

  • Overusing UNIQUE Constraints: Applying UNIQUE constraints too liberally can lead to difficulties in data entry.
  • Ignoring FOREIGN KEY Constraints: Failing to define FOREIGN KEY constraints can result in orphaned records, leading to data integrity issues.
  • Neglecting DEFAULT Constraints: Not using DEFAULT constraints can lead to unexpected NULL values in your data.

Related Articles