In DuckDB, the generate_series()
function is a powerful tool for generating sequences of values, which can be very useful for data analysis, testing, and creating sample datasets. This function enables us to easily create series of numbers, dates, and timestamps with minimal code.
Syntax
The generate_series()
function can produce a series of numbers or date/time values.
When used to generate a series of numbers, it can be used in the following ways:
generate_series(stop)
generate_series(start, stop)
generate_series(start, stop, step)
When used to generate a series of date/time values, the syntax goes something like this:
generate_series(start, stop, interval)
Here’s a more detailed depiction of the various ways we can use the function:
generate_series(BIGINT) -> BIGINT[]
generate_series(BIGINT, BIGINT) -> BIGINT[]
generate_series(BIGINT, BIGINT, BIGINT) -> BIGINT[]
generate_series(TIMESTAMP, TIMESTAMP, INTERVAL) -> TIMESTAMP[]
generate_series(TIMESTAMP WITH TIME ZONE, TIMESTAMP WITH TIME ZONE, INTERVAL) -> TIMESTAMP WITH TIME ZONE[]
Basic Numeric Series
A common use of generate_series()
is to generate a sequence of integers. When we do this we have the option of specifying the start and end point, or just the end point. We also have the option of specifying a step amount.
Here’s an example that uses the simplest form, where we specify just the end point:
SELECT generate_series(5);
Result:
+--------------------+
| generate_series(5) |
+--------------------+
| [0, 1, 2, 3, 4, 5] |
+--------------------+
By specifying just the end point, the function started at zero and ended at the end point (which in this case is 5).
We can use the following technique to have the series displayed in a table, rather than an array:
SELECT * FROM generate_series(5);
Result:
+-----------------+
| generate_series |
+-----------------+
| 0 |
| 1 |
| 2 |
| 3 |
| 4 |
| 5 |
+-----------------+
Alternatively, we could use the unnest()
function:
SELECT unnest(generate_series(5));
Result:
+----------------------------+
| unnest(generate_series(5)) |
+----------------------------+
| 0 |
| 1 |
| 2 |
| 3 |
| 4 |
| 5 |
+----------------------------+
We can also provide a start point if we so wish:
SELECT * FROM generate_series(1, 5);
Result:
+-----------------+
| generate_series |
+-----------------+
| 1 |
| 2 |
| 3 |
| 4 |
| 5 |
+-----------------+
We can also specify a step value as a third argument. Here’s an example where we generate even numbers from 0 to 10:
SELECT * FROM generate_series(0, 10, 2);
Output:
+-----------------+
| generate_series |
+-----------------+
| 0 |
| 2 |
| 4 |
| 6 |
| 8 |
| 10 |
+-----------------+
Negative steps work for descending sequences:
SELECT * FROM generate_series(10, 0, -2);
Output:
+-----------------+
| generate_series |
+-----------------+
| 10 |
| 8 |
| 6 |
| 4 |
| 2 |
| 0 |
+-----------------+
Generating a Date Series
As alluded to, we can also use generate_series()
to create date sequences.
Here’s an example of generating a series of dates for the first seven days in January 2025:
SELECT * FROM generate_series(
DATE '2025-01-01',
DATE '2025-01-07',
INTERVAL '1 day'
);
Result:
+---------------------+
| generate_series |
+---------------------+
| 2025-01-01 00:00:00 |
| 2025-01-02 00:00:00 |
| 2025-01-03 00:00:00 |
| 2025-01-04 00:00:00 |
| 2025-01-05 00:00:00 |
| 2025-01-06 00:00:00 |
| 2025-01-07 00:00:00 |
+---------------------+
We can add a third argument in order to specify an interval, such as weeks. Let’s generate dates for each Monday in Q1 2025:
SELECT * FROM generate_series(
DATE '2025-01-06', -- First Monday of 2025
DATE '2025-03-31', -- End of Q1
INTERVAL '1 week'
);
Result:
+---------------------+
| generate_series |
+---------------------+
| 2025-01-06 00:00:00 |
| 2025-01-13 00:00:00 |
| 2025-01-20 00:00:00 |
| 2025-01-27 00:00:00 |
| 2025-02-03 00:00:00 |
| 2025-02-10 00:00:00 |
| 2025-02-17 00:00:00 |
| 2025-02-24 00:00:00 |
| 2025-03-03 00:00:00 |
| 2025-03-10 00:00:00 |
| 2025-03-17 00:00:00 |
| 2025-03-24 00:00:00 |
| 2025-03-31 00:00:00 |
+---------------------+
Timestamp Series
We can see from the previous examples that the generate_series()
function returns its result as a TIMESTAMP
, even when we pass DATE
values. Here’s an example of passing actual TIMESTAMP
values to generate a time series across a single day:
SELECT * FROM generate_series(
TIMESTAMP '2025-03-14 08:00:00',
TIMESTAMP '2025-03-14 17:00:00',
INTERVAL '1 hour'
);
Output:
+---------------------+
| generate_series |
+---------------------+
| 2025-03-14 08:00:00 |
| 2025-03-14 09:00:00 |
| 2025-03-14 10:00:00 |
| 2025-03-14 11:00:00 |
| 2025-03-14 12:00:00 |
| 2025-03-14 13:00:00 |
| 2025-03-14 14:00:00 |
| 2025-03-14 15:00:00 |
| 2025-03-14 16:00:00 |
| 2025-03-14 17:00:00 |
+---------------------+
Timestamp With Time Zone
The generate_series()
function also supports TIMESTAMP WITH TIME ZONE
values (also known as TIMESTAMPTZ
):
SELECT * FROM generate_series(
TIMESTAMPTZ '2025-03-14 08:00:00+05:30',
TIMESTAMPTZ '2025-03-14 17:00:00+05:30',
INTERVAL '1 hour'
);
Result on my machine:
+------------------------+
| generate_series |
+------------------------+
| 2025-03-14 12:30:00+10 |
| 2025-03-14 13:30:00+10 |
| 2025-03-14 14:30:00+10 |
| 2025-03-14 15:30:00+10 |
| 2025-03-14 16:30:00+10 |
| 2025-03-14 17:30:00+10 |
| 2025-03-14 18:30:00+10 |
| 2025-03-14 19:30:00+10 |
| 2025-03-14 20:30:00+10 |
| 2025-03-14 21:30:00+10 |
+------------------------+
An Alternative: The range()
Function
DuckDB also has a range()
function, which works almost the same as generate_series()
, with the exception that the stop
value is exclusive.
Here’s an example that demonstrates the difference:
SELECT
range(5),
generate_series(5);
Result:
+-----------------+--------------------+
| "range"(5) | generate_series(5) |
+-----------------+--------------------+
| [0, 1, 2, 3, 4] | [0, 1, 2, 3, 4, 5] |
+-----------------+--------------------+
Here, range(5)
went from zero to four, whereas generate_series(5)
went from zero to five. Other than that, they work pretty much the same.