Using the AGE() Function to Compare Dates in DuckDB

When working with date and time data in DuckDB, calculating the difference between two dates is a common requirement. Whether we’re determining a person’s age from their birthdate or measuring the duration between two events, DuckDB’s age() function provides a straightforward solution. This function returns an interval representing the difference between two timestamps or dates, making it especially useful for time-based analyses.

In this article, we’ll explore how to use the age() function in DuckDB. We’ll cover its syntax, and provide some simple examples.

Understanding the age() Function

The age() function in DuckDB computes the difference between two date or timestamp values and returns an interval that represents the elapsed time. The function follows this basic syntax:

age(end_date, start_date)
  • end_date: The later date or timestamp (optional).
  • start_date: The earlier date or timestamp.
  • The function returns an INTERVAL representing the difference between the two.

If only one argument is provided, DuckDB assumes the current timestamp as the reference.

Here’s a more detailed version of its syntax:

age(TIMESTAMP) -> INTERVAL
age(TIMESTAMP, TIMESTAMP) -> INTERVAL
age(TIMESTAMP WITH TIME ZONE, TIMESTAMP WITH TIME ZONE) -> INTERVAL
age(TIMESTAMP WITH TIME ZONE) -> INTERVAL

This shows that the output is always an INTERVAL value.

Example with 1 Argument

Here’s an example of using the age() function with just one argument to get the age based on the current date:

SELECT age( TIMESTAMP '1987-06-15 12:45:37' );

Result:

37 years 8 months 20 days 05:28:19.156

We can also pass a DATE value:

SELECT age( DATE '1987-06-15' );

Result:

37 years 8 months 20 days 18:14:48.847

Example with 2 Arguments

We also have the option of passing two arguments:

SELECT age( DATE '2000-06-15', DATE '1987-06-15' );

Result:

13 years

In this case the month and day portions were the same, and so they aren’t included in the result. No time part was returned either.

Using age() with the current_date Function

We can use the current_date function to get the age based on the current date:

SELECT age( current_date, DATE '1987-06-15' );

Result:

37 years 8 months 20 days

When we do this, it doesn’t calculate the time portion. To do that, simply omit the current_date (so that only one argument is provided). You could also use current_localtimestamp() instead of current_date.

Here they are combined:

SELECT 
    age( DATE '1987-06-15' ) AS age_1,
    age( current_date, DATE '1987-06-15' ) AS age_2,
    age( current_localtimestamp(), DATE '1987-06-15' ) AS age_3;

Result:

+----------------------------------------+---------------------------+----------------------------------------+
| age_1 | age_2 | age_3 |
+----------------------------------------+---------------------------+----------------------------------------+
| 37 years 8 months 20 days 18:36:37.229 | 37 years 8 months 20 days | 37 years 8 months 21 days 04:36:37.229 |
+----------------------------------------+---------------------------+----------------------------------------+

Getting a Negative Interval

If the start date is later than the end date, DuckDB returns a negative interval:

SELECT age( DATE '2023-03-01', DATE '2024-03-01' );

Result:

-1 year

The Difference Between age() and date_diff()

DuckDB also has a date_diff() function, and it would be easy to assume that it does the same thing as the age() function. But that would be an incorrect assumption.

The date_diff() function (and its alias datediff()) returns the number of partition boundaries between the dates.

Actually, there’s also a date_sub() function (and its alias datesub()). This function is similar to date_diff(), except that it returns the number of complete partitions between the dates.

These functions return their result as a signed BIGINT (not an INTERVAL like with the age() function). They also require us to specify the date part as the first argument. Also, the dates go the other way around.

Here’s an example that illustrates the difference between these functions:

SELECT 
    age( DATE '2000-06-15', DATE '1987-06-15' ) AS 'age()',
    date_diff( 'YEARS', DATE '2000-06-15', DATE '1987-06-15' ) AS 'date_diff()',
    datediff( 'YEARS', DATE '2000-06-15', DATE '1987-06-15' ) AS 'datediff()',
    date_sub( 'YEARS', DATE '2000-06-15', DATE '1987-06-15' ) AS 'date_sub()',
    datesub( 'YEARS', DATE '2000-06-15', DATE '1987-06-15' ) AS 'datesub()';

Result:

+----------+-------------+------------+------------+-----------+
| age() | date_diff() | datediff() | date_sub() | datesub() |
+----------+-------------+------------+------------+-----------+
| 13 years | -13 | -13 | -13 | -13 |
+----------+-------------+------------+------------+-----------+

We can see that we had to provide 'YEARS' as the first argument in order to get the number of years when using these other functions. We could just have easily specified a different date part (like MONTHS for example).

Also, date_diff() and the other date_...() functions accept the dates in a different order, so we got a negative value when using those functions.

While we’re at it, let’s see how date_diff() differs to the other date_...() functions:

SELECT 
    age( DATE '2000-06-15', DATE '1987-06-15' ) AS 'age()',
    date_diff( 'DECADES', DATE '2000-06-15', DATE '1987-06-15' ) AS 'date_diff()',
    datediff( 'DECADES', DATE '2000-06-15', DATE '1987-06-15' ) AS 'datediff()',
    date_sub( 'DECADES', DATE '2000-06-15', DATE '1987-06-15' ) AS 'date_sub()',
    datesub( 'DECADES', DATE '2000-06-15', DATE '1987-06-15' ) AS 'datesub()';

Result:

+----------+-------------+------------+------------+-----------+
| age() | date_diff() | datediff() | date_sub() | datesub() |
+----------+-------------+------------+------------+-----------+
| 13 years | -2 | -2 | -1 | -1 |
+----------+-------------+------------+------------+-----------+

When we use a larger date part, that’s when we can see the difference. date_diff()/datediff() returns the number of partition boundaries between the dates, whereas the date_sub()/datesub() functions return the number of complete partitions between the dates.

In any case, if we want an interval to be returned, we should use the age() function.