The 6 SERIAL Data Types in PostgreSQL

The SERIAL data type in PostgreSQL is a pseudo-type used to create an auto-incrementing sequence of integers for a column. It is commonly used for primary keys, as it eliminates the need to manually assign unique identifiers for each new record. PostgreSQL handles this by automatically creating a sequence object that supplies a unique number each time a new row is inserted.

PostgreSQL provides three SERIAL types, each with two options for usage; which effectively equates to six different types. So basically, we have six options to choose from when creating a SERIAL column.

Comparing the Different SERIAL Types

The following table provides a breakdown of the six different options we have when creating a SERIAL column:

SERIAL TypeStorage SizeRangeUse Case
smallserial
serial2
2 bytes1 to 32,767Small datasets, limited records
serial
serial4
4 bytes1 to 2147483647General-purpose, moderate range
bigserial
serial8
8 bytes1 to 9223372036854775807Large datasets, high-volume tables

So you can see that each SERIAL type has a corresponding serialn version, where n represents the number of bytes in that option. We have the option of specifying one or the other when creating a SERIAL column. For example, we could use bigserial or we could use serial8 to do the same thing.

Corresponding Data Types

The SERIAL types aren’t actually true data types. Rather, they’re a notational convenience for creating unique identifier columns. Each of these SERIAL “types” has a corresponding (actual) integer data type, as outlined in the following table:

Type NameCreates the Following Data Type
smallserial
serial2
smallint
serial
serial4
integer
bigserial
serial8
bigint

The flexibility of being able to specify the exact type allows us to optimize storage depending on the expected volume of data.

Use Cases for Different SERIAL Types

If you’re going to use the SERIAL type, it’s important to choose the right one:

  • smallserial/serial2: Suitable for static lookup tables or small reference tables where the number of entries is limited.
  • serial/serial4: For moderate-scale applications where a few million rows are expected. Common in most business applications.
  • bigserial/serial8: Ideal for large-scale applications, such as analytics platforms, that handle billions of rows over time.

Choosing the wrong type could have a serious impact on your database. For example, if you choose smallserial, but your table needs more than 32,767 rows, then you’ll run into trouble once it tries to insert the 32,768th row.

Conversely, if you choose bigserial but you need no more than 2,147,483,647 rows, then you’ll be using up more space than you need.

How SERIAL Works Behind the Scenes

When you define a column as SERIAL, PostgreSQL automatically creates a sequence for that column. A sequence is a special kind of database object designed to generate unique numeric values. Each time a new record is inserted, PostgreSQL fetches the next number in the sequence.

For example, if we create a table like this:

CREATE TABLE orders (
    order_id SERIAL
);

Behind the scenes, something like this will happen:

CREATE SEQUENCE orders_order_id_seq;
ALTER TABLE orders ALTER COLUMN order_id SET DEFAULT NEXTVAL('orders_order_id_seq');

The sequence object (orders_order_id_seq) keeps track of the current value, and the NEXTVAL() function retrieves the next number. This process is transparent to the user.

And the table’s definition will end up looking something like this:

CREATE TABLE IF NOT EXISTS public.orders
(
order_id integer NOT NULL DEFAULT nextval('orders_order_id_seq'::regclass)
)

Alternative to SERIAL: Identity Columns

Since PostgreSQL 10, an alternative to SERIAL columns is the IDENTITY column. This feature follows the SQL standard more closely and can be used in place of SERIAL.

Example:

  CREATE TABLE users (
      user_id INT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
      username VARCHAR(50)
  );

Identity columns provide more control, allowing users to define whether the system should always generate the value (ALWAYS) or only when no value is provided (BY DEFAULT).

Like with SERIAL columns, identity columns have an implicit sequence attached to them.