Understanding Domains in PostgreSQL

PostgreSQL offers a handy feature called domains that can greatly enhance data integrity and simplify database design. In this article, we’ll explore what domains are, how they work, and when to use them in our database projects.

What is a Domain?

A domain in PostgreSQL is basically a data type with optional constraints. We can create our own domains as a kind of user-defined data type. It’s basically a way for us to use an existing data type, place extra restrictions on it, and save it so that we can reuse it across multiple columns/tables. This way, we get the functionality of the underlying data type, plus any extra stuff we’ve defined (such as a default value or a constraint).

Think of it as a reusable template for column definitions. For example, we could create an email_address domain that we can use in any column that stores email addresses.

Benefits of Using Domains

Some of the benefits of creating and using domains could include:

  • Improved data integrity.
  • Simplified database design.
  • Easier maintenance and updates.
  • Better code readability.
  • Reusability across multiple tables.

When to Use Domains

Domains are particularly useful in the following scenarios:

  • When we need to enforce the same constraints on multiple columns across different tables.
  • When we want to create custom data types that encapsulate business rules.
  • When we need to simplify complex CHECK constraints by packaging them into a reusable unit.
  • When we want to improve the readability and maintainability of our database schema.

Creating and Using Domains

Let’s look at some examples to understand how to create and use domains in PostgreSQL.

Example 1: Email Domain

Suppose we want to ensure that all email addresses in our database follow a specific pattern. We can create a domain for this:

CREATE DOMAIN email_address AS VARCHAR(255)
CHECK (VALUE ~ '^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$');

CREATE TABLE users (
    id SERIAL PRIMARY KEY,
    name VARCHAR(100),
    email email_address NOT NULL
);

In this example, we created an email_address domain that checks if the input matches a basic email pattern. We do this using a CHECK constraint.

The users table then uses this domain for the email column, ensuring that all inserted email addresses conform to the specified pattern.

Now let’s try to insert data:

INSERT INTO users (name, email) VALUES
('John Doe', '[email protected]'),
('Jane Smith', '[email protected]'),
('Invalid User', 'not-an-email');  -- This will fail

Result:

ERROR:  value for domain email_address violates check constraint "email_address_check"

As expected, we got an error, due to the last user containing an invalid email address.

Let’s change that last row to use a valid email address:

INSERT INTO users (name, email) VALUES
('John Doe', '[email protected]'),
('Jane Smith', '[email protected]'),
('Steve Jenkins', '[email protected]');

Result:

INSERT 0 3

This time all three rows were inserted as expected.

Example 2: Age Domain with Default Value

Let’s create a domain for age that ensures it’s within a reasonable range:

CREATE DOMAIN age_years AS INTEGER
CHECK (VALUE >= 0 AND VALUE <= 150);

CREATE TABLE employees (
    id SERIAL PRIMARY KEY,
    name VARCHAR(100),
    age age_years
);

Here, we’ve created an age_years domain that ensures the age is between 0 and 150. The employees table uses this domain for the age column.

Now let’s try inserting data:

INSERT INTO employees (name, age) VALUES
('Alice Johnson', 30),
('Charlie Brown', 200);  -- This will fail

SELECT * FROM employees;

Result:

ERROR:  value for domain age_years violates check constraint "age_years_check"

Again we get an error due to data violating the constraints of our domain.

Let’s change the last user’s age to one that fits within the constraint:

INSERT INTO employees (name, age) VALUES
('Alice Johnson', 30),
('Charlie Brown', 100);

SELECT * FROM employees;

Result:

 id |     name      | age 
----+---------------+-----
2 | Alice Johnson | 30
3 | Charlie Brown | 100

This time it succeeded.

Example 3: Currency Domain

For financial applications, we might want to ensure that monetary values are handled consistently:

CREATE DOMAIN money_usd AS DECIMAL(10, 2)
CHECK (VALUE >= 0)
DEFAULT 0.00;

CREATE TABLE products (
    id SERIAL PRIMARY KEY,
    name VARCHAR(100),
    price money_usd NOT NULL
);

This money_usd domain ensures that all monetary values are non-negative and have exactly two decimal places.

Let’s try it out:

INSERT INTO products (name, price) VALUES
('Widget', 19.99),
('Gadget', 49.50),
('Gizmo', -5.00);  -- This will fail

Result

ERROR:  value for domain money_usd violates check constraint "money_usd_check"

The third product failed due to having a negative price.

Let’s switch it to a positive value and try again:

INSERT INTO products (name, price) VALUES
('Widget', 19.99),
('Gadget', 49.50),
('Gizmo', 5.00);

Result

INSERT 0 3

Success.

Modifying Domains

You can alter existing domains using the ALTER DOMAIN command. For example, to add a new constraint to our email_address domain:

ALTER DOMAIN email_address
ADD CONSTRAINT email_length CHECK (LENGTH(VALUE) >= 5);

This adds an additional check to ensure that email addresses are at least 5 characters long.

Retrieve a List of Domains

When using the psql command-line interface, we can use the following command to get a list of user defined domains:

\dD

Result:

 Schema |     Name      |          Type          | Collation | Nullable | Default |                                                      Check                                                      
--------+---------------+------------------------+-----------+----------+---------+-----------------------------------------------------------------------------------------------------------------
public | age_years | integer | | | | CHECK (VALUE >= 0 AND VALUE <= 150)
public | email_address | character varying(255) | | | | CHECK (VALUE::text ~ '^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$'::text) CHECK (length(VALUE::text) >= 5)
public | money_usd | numeric(10,2) | | | 0.00 | CHECK (VALUE >= 0::numeric)
(3 rows)

This command returns a lot of useful information about the domains, including its base type, CHECK constraints, default value, etc.

We can also see the updated definition for the email_address domain (i.e. the new constraint that we added when we modified the domain in the previous example).

Whenever we’re not using psql, we can run something like this to return a simple list of domain names:

SELECT typname
FROM pg_catalog.pg_type
WHERE typtype = 'd';

Result:

     typname     
-----------------
cardinal_number
character_data
sql_identifier
time_stamp
yes_or_no
email_address
age_years
money_usd
(8 rows)

We can see that the domains we created are the last three domains in this list.

The first five domains in this list are actually built in domains for the information schema only. Every column in the information schema has one of these five types. The PostgreSQL documentation states:

You should not use these types for work outside the information schema, but your applications must be prepared for them if they select from the information schema.

We can use a more complex query to get more detailed info about the domains:

SELECT 
    n.nspname AS schema_name,
    t.typname AS domain_name,
    pg_catalog.format_type(t.typbasetype, t.typtypmod) AS base_type,
    t.typnotnull AS not_null,
    t.typdefault AS default_value,
    pg_catalog.array_to_string(ARRAY(
        SELECT pg_catalog.pg_get_constraintdef(r.oid, true)
        FROM pg_catalog.pg_constraint r
        WHERE t.oid = r.contypid
    ), ' AND ') AS check_constraints,
    d.description AS comment
FROM pg_catalog.pg_type t
LEFT JOIN pg_catalog.pg_namespace n ON n.oid = t.typnamespace
LEFT JOIN pg_catalog.pg_description d ON d.objoid = t.oid AND d.classoid = 'pg_type'::regclass
WHERE t.typtype = 'd'
AND n.nspname NOT IN ('pg_catalog', 'information_schema')
ORDER BY schema_name, domain_name;

Result:

                                                | comment 
-------------+---------------+------------------------+----------+---------------+---------------------------------------------------------------------------------------------------------------------+---------
public | age_years | integer | f | | CHECK (VALUE >= 0 AND VALUE <= 150) |
public | email_address | character varying(255) | f | | CHECK (VALUE::text ~ '^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$'::text) AND CHECK (length(VALUE::text) >= 5) |
public | money_usd | numeric(10,2) | f | 0.00 | CHECK (VALUE >= 0::numeric) |
(3 rows)

This basically provides the same info that we got when using the \dD command in psql.

Conclusion

PostgreSQL domains are a powerful feature that can help us maintain data integrity, simplify our database design, and improve code readability. By creating these custom data types with built-in constraints and default values, we can ensure data consistency across our databases and reduce the likelihood of data-related errors in our applications.