Constraints are an important concept in relational database management systems (RDBMSs). Whenever we design a database, we need to ensure that it will be able to enforce data integrity, by checking or restricting what the user can enter to only data that conforms to the rules of the database. That’s where a constraint can help.
This article explores what SQL constraints are, the various types available, their importance, and how they are implemented.
What is a Constraint in SQL?
In SQL, a constraint is a rule that is applied to a column or a table to limit the types of data that can be inserted. Constraints help maintain the accuracy and integrity of the data by ensuring that the data entered into a table meets specific criteria. Constraints can be applied at the column level (affecting a single column) or at the table level (affecting multiple columns).
Types of SQL Constraints
There are many types of constraints, each one having their own purpose. Below are the main constraint types, along with a quick description and example of the SQL used to create each one.
NOT NULL
Constraint- Ensures that a column cannot have a NULL value.
- Example:
CREATE TABLE Employees (ID int NOT NULL, Name varchar(255) NOT NULL);
UNIQUE
Constraint- Ensures that all the values in a column are unique.
- Example:
CREATE TABLE Employees (ID int UNIQUE, Email varchar(255) UNIQUE);
PRIMARY KEY
Constraint- Uniquely identifies each row in a table. It’s effectively a combination of
NOT NULL
andUNIQUE
, but it’s explicitly defined asPRIMARY KEY
. There can only be one primary key on a table. - Example:
CREATE TABLE Employees (ID int PRIMARY KEY, Name varchar(255));
- Uniquely identifies each row in a table. It’s effectively a combination of
FOREIGN KEY
Constraint- Ensures referential integrity by linking two tables together. The
FOREIGN KEY
in one table points to thePRIMARY KEY
in another table. - Example:
CREATE TABLE Orders (OrderID int, EmployeeID int, FOREIGN KEY (EmployeeID) REFERENCES Employees(ID));
- Ensures referential integrity by linking two tables together. The
CHECK
Constraint- Ensures that all values in a column satisfy a specific condition as specified in its definition.
- Example:
CREATE TABLE Employees (ID int, Age int CHECK (Age >= 18));
DEFAULT
Constraint- Sets a default value for a column if no value is specified.
- Example:
CREATE TABLE Employees (ID int, Country varchar(255) DEFAULT 'USA');
INDEX
Constraint- Although not strictly a constraint, an index is often used to enforce
UNIQUE
andPRIMARY KEY
constraints, speeding up query performance. - Example:
CREATE UNIQUE INDEX UX_Email ON Employees(Email);
- Although not strictly a constraint, an index is often used to enforce
Importance of SQL Constraints
Constraints are vital for:
- Data Integrity: Ensuring that the data stored in the database is accurate and consistent.
- Data Validation: Preventing invalid data from being entered into the database.
- Data Relationships: Maintaining referential integrity between tables, ensuring data remains synchronized across related tables.
- Performance Optimization: Some constraints, such as those with an index, can improve the performance of queries.
How to Implement SQL Constraints
SQL constraints can be implemented during the creation of tables (with the CREATE TABLE
statement) or they can be added to existing tables using the ALTER TABLE
statement.
Adding Constraints During Table Creation
CREATE TABLE Employees (
ID int PRIMARY KEY,
Name varchar(255) NOT NULL,
Age int CHECK (Age >= 18),
Email varchar(255) UNIQUE
);
This table has a constraint on every column:
- The
ID
column has aPRIMARY KEY
constraint. This uniquely identifies the row. - The
Name
column has aNOT NULL
constraint. This prevents NULL values from entering the column. - The
Age
column has aCHECK
constraint. In this case we’re checking that the employee is at least 18 years old. - The
Email
column has aUNIQUE
constraint. Therefore, no two employees can have the same email address.
Another way to do it is add the constraints after the column definitions:
CREATE TABLE Employees (
ID int,
Name varchar(255) NOT NULL,
Age int,
Email varchar(255),
PRIMARY KEY (ID),
CHECK (Age >= 18),
UNIQUE (Email)
);
Note that NOT NULL
constraints are an exception. In most (if not all) major RDBMSs, these are defined within the column’s definition and therefore we need to keep the NOT NULL
part inside the column’s definition.
Constraints added after the column definitions are sometimes referred to as table constraints. One benefit of table constraints is that we can apply a single constraint across multiple tables. For example:
CREATE TABLE Employees (
ID int,
Name varchar(255) NOT NULL,
Age int,
Email varchar(255),
PRIMARY KEY (ID),
CHECK (Age >= 18 AND len(Email) > 4),
UNIQUE (Email)
);
In this case I defined a table constraint like this: CHECK (Age >= 18 AND len(Email) > 4)
This constraint covers two columns; Age
and Email
In practice, you would probably want to have a better check for the email format than the one listed here, but this is just a simple example for demonstration purposes. And you might prefer to keep constraints like these ones separate from each another.
Constraint Names
If we don’t provide a name for our constraints, the database engine will name them for us. In the above examples, we didn’t provide a name, and so the database engine will generate a name for them. Depending on the RDBMS, this will often be a name that contains a bunch of random characters. This can make things harder if we want to modify or drop an existing constraint, as we will need to find the name first. And if we drop a constraint and recreate it, it will probably be recreated with a different name.
For example, here’s what SQL Server named the above constraints for me:
column_name constraint_name
----------- ------------------------------
ID PK__Employee__3214EC272DDABBB0
Email UQ__Employee__A9D105346631E364
Age CK__Employees__5A1A5A11
Email CK__Employees__5A1A5A11
We can see that the first part of the constraint name is somewhat descriptive, but the last part appears to be just a bunch of jumbled up characters.
We’ll look at how I found out their names later, but for now, let’s see if we can make our constraint names a bit easier to read and remember.
Fortunately we have an alternative option; we can provide our own name for each constraint.
Here’s an example of creating the same constraints as above, except this time we provide our own names:
CREATE TABLE Employees (
ID int CONSTRAINT PK_Employee PRIMARY KEY,
Name varchar(255) NOT NULL,
Age int CONSTRAINT CHK_Employee_Age CHECK (Age >= 18),
Email varchar(255) CONSTRAINT UQ_Employee_Email UNIQUE (Email)
);
This was the example of including the constraints in the column definitions.
Here it is again, but with the constraints listed at the end:
CREATE TABLE Employees (
ID int,
Name varchar(255) NOT NULL,
Age int,
Email varchar(255),
CONSTRAINT PK_Employee PRIMARY KEY (ID),
CONSTRAINT CHK_Employee_Age CHECK (Age >= 18),
CONSTRAINT UQ_Employee_Email UNIQUE (Email)
);
As mentioned, NOT NULL
constraints are a bit of an exception, and so it remains in the column definition.
Adding Constraints to Existing Tables
We can add constraints to existing tables if required. We can do this with the ALTER TABLE
statement.
For example:
ALTER TABLE Orders
ADD CONSTRAINT PK_Orders PRIMARY KEY (ID);
This example adds a PRIMARY KEY
constraint called PK_Orders
to the Orders
table. The assumption is that we didn’t add the PRIMARY KEY
when we created the table.
Here’s an example of creating a table without any constraints (except for the NOT NULL
constraints), and then adding the constraints later:
-- Step 1: Create the table
CREATE TABLE Employees (
ID int NOT NULL,
Name varchar(255) NOT NULL,
Age int,
Email varchar(255),
DepartmentID int
);
-- Step 2: Add constraints
ALTER TABLE Employees
ADD CONSTRAINT PK_Employee PRIMARY KEY (ID);
ALTER TABLE Employees
ADD CONSTRAINT UQ_Employee_Email UNIQUE (Email);
ALTER TABLE Employees
ADD CONSTRAINT CHK_Employee_Age CHECK (Age >= 18);
You’ll often see scripts like this; where the developer creates the table first (without any constraints), and then they immediately add all the constraints under the CREATE TABLE
statement. So basically, creating the table, then creating all constraints for that table.
One thing you may notice is that I defined the primary key column as NOT NULL
. If I hadn’t done that, I would’ve received an error when I tried to add the primary key constraint to it later (due to the column not having a NOT NULL
constraint). If you’re trying to add a primary key constraint to a column that doesn’t have a NOT NULL
constraint, you can always alter the existing column to be NOT NULL
first:
ALTER TABLE Employees
ALTER COLUMN ID int NOT NULL;
Then the primary key constraint can be added once that’s done.
If the candidate key already contains NULL values, then you’ll need to eliminate them before adding the NOT NULL
and primary key constraints.
Finding Constraint Names
Sometimes we need to make modifications to existing constraints, or perhaps we simply want to check to see what constraints are on a table, etc.
In such cases, we might like to see a list of constraints on a given table. Fortunately there are many ways of doing this. However, the available options do vary between RDBMS.
One standard way to do this is to use the information schema views. For example, we can use the information_schema.constraint_column_usage
view to get a list of all constraint names on a given table:
SELECT
column_name,
constraint_name
FROM information_schema.constraint_column_usage
WHERE table_name = 'Employees';
Example output:
column_name constraint_name
----------- -----------------
ID PK_Employee
Email UQ_Employee_Email
Age CHK_Employee_Age
We can see our constraint names along with their respective columns.
Foreign Key Constraints
A foreign key constraint is used to define a relationship between two tables (or in some cases, a relationship within the same table). We normally create foreign keys when we want data in one table to match data in another for the purposes of creating a parent/child relationship. That is, we’ll have a parent table and a child table. The child table’s foreign key references the parent table’s primary key. This means that any data that goes into the child table’s foreign key column must already exist in the parent’s foreign key column.
Here’s an example:
CREATE TABLE Types(
TypeId INTEGER PRIMARY KEY,
Type
);
CREATE TABLE Pets(
PetId INTEGER PRIMARY KEY,
PetName,
TypeId,
FOREIGN KEY(TypeId) REFERENCES Types(TypeId)
);
The part that creates the foreign key is this:
FOREIGN KEY(TypeId) REFERENCES Types(TypeId)
The FOREIGN KEY(TypeId)
part declares Pets.TypeId
as the foreign key.
Once these two tables have been created, whenever we insert a row into the Pets
table, we must make sure that the value we specify for the TypeId
column is a value that’s already in the TypeId
of the Types
table. If we specify a value that’s not already in the parent table, we’ll get an error. This is the whole purpose of a foreign key constraint – it prevents users from entering data that would create an orphaned record. It helps us to maintain referential integrity.
Best Practices for Using SQL Constraints
Here are some best practices we should keep in mind for constraints when designing our databases:
- Always Define Primary Keys: In most cases, every table should have a primary key to uniquely identify records. While there can be exceptions to this rule, they are usually quite rare.
- Use
NOT NULL
Where Necessary: Ensure critical fields do not accept NULL values to avoid data inconsistencies. - Regularly Update Constraints: Modify constraints as the database schema evolves to maintain data integrity.
- Document Your Constraints: Keep detailed documentation of the constraints applied to each table for better database management.
Common Mistakes to Avoid
Here are some common mistakes that we should try to avoid:
- Overusing
UNIQUE
Constraints: ApplyingUNIQUE
constraints too liberally can lead to difficulties in data entry. - Ignoring
FOREIGN KEY
Constraints: Failing to defineFOREIGN KEY
constraints can result in orphaned records, leading to data integrity issues. - Neglecting
DEFAULT
Constraints: Not usingDEFAULT
constraints can lead to unexpected NULL values in your data.