Dealing with Different Date Formats When Using STRPTIME() in DuckDB

In DuckDB, the strptime() function converts a date/time string into a valid timestamp value. We pass a format string to the function in order to tell it what format our string uses. This can be handy if we ever need to construct timestamps based on date/time strings that may or may not be in a valid format.

But what if we have multiple date/time strings in different formats?

Fortunately, the strptime() function caters for this scenario too.

How it Works

When we use the strptime() function in DuckDB, we have the option of providing a single format string, or a list of format strings. So it goes something like this:

strptime( text, format_string )
strptime( text, [ format_string, format_string, ...] )

If we choose the later, DuckDB will try each format string in the list until one succeeds.

Quick Example

Here’s a basic example of providing multiple format strings:

SELECT strptime('15/07/2023', ['%m/%d/%Y','%d/%m/%Y', '%A, %B %d, %Y']);

Output:

2023-07-15 00:00:00

Here, I provided a single date value, but three format strings. DuckDB then started going through the list of format strings:

  • Format string 1: It skipped the first format string, because that wouldn’t have worked (it would have resulted in 15 being assigned to the month part, but the month part can’t be higher than 12).
  • Format string 2: After skipping the first format string, it tried the second one. In this case the second one succeeded, and so it parsed the date string based on that format.
  • Format string 3: It ignored the third format string because the second one succeeded.

Examples with Strings in Multiple Formats

Suppose we create a table with date strings in multiple formats:

-- Create a table with date strings in various formats
CREATE OR REPLACE TABLE date_examples AS
SELECT * FROM (
    VALUES
    ('2023-01-15', 'ISO format (YYYY-MM-DD)'),
    ('01/15/2023', 'US format (MM/DD/YYYY)'),
    ('15/01/2023', 'European format (DD/MM/YYYY)'),
    ('15-Jan-2023', 'Custom format with month name'),
    ('Jan 15, 2023', 'Month Day, Year format'),
    ('2023.01.15', 'Dot-separated format'),
    ('20230115', 'Basic format without separators'),
    ('Sunday, January 15, 2023', 'Full date with weekday')
) AS t(date_string, description);

-- View the sample data
SELECT * FROM date_examples;

Output:

+--------------------------+---------------------------------+
| date_string | description |
+--------------------------+---------------------------------+
| 2023-01-15 | ISO format (YYYY-MM-DD) |
| 01/15/2023 | US format (MM/DD/YYYY) |
| 15/01/2023 | European format (DD/MM/YYYY) |
| 15-Jan-2023 | Custom format with month name |
| Jan 15, 2023 | Month Day, Year format |
| 2023.01.15 | Dot-separated format |
| 20230115 | Basic format without separators |
| Sunday, January 15, 2023 | Full date with weekday |
+--------------------------+---------------------------------+

This table contains dates in lots of different formats.

Fortunately, we can provide a list of format strings to strptime() in order to deal with the various formats when running a query against that table:

SELECT 
    date_string,
    description,
    -- Try multiple formats, returning the first one that successfully parses
    strptime(date_string, [
        '%Y-%m-%d',        -- ISO format
        '%m/%d/%Y',        -- US format
        '%d/%m/%Y',        -- European format
        '%d-%b-%Y',        -- Custom format with abbreviated month name
        '%b %d, %Y',       -- Month Day, Year format
        '%Y.%m.%d',        -- Dot-separated format
        '%Y%m%d',          -- Basic format without separators
        '%A, %B %d, %Y'    -- Full date with weekday
    ]) AS parsed_date
FROM date_examples;

Result:

+--------------------------+---------------------------------+---------------------+
| date_string | description | parsed_date |
+--------------------------+---------------------------------+---------------------+
| 2023-01-15 | ISO format (YYYY-MM-DD) | 2023-01-15 00:00:00 |
| 01/15/2023 | US format (MM/DD/YYYY) | 2023-01-15 00:00:00 |
| 15/01/2023 | European format (DD/MM/YYYY) | 2023-01-15 00:00:00 |
| 15-Jan-2023 | Custom format with month name | 2023-01-15 00:00:00 |
| Jan 15, 2023 | Month Day, Year format | 2023-01-15 00:00:00 |
| 2023.01.15 | Dot-separated format | 2023-01-15 00:00:00 |
| 20230115 | Basic format without separators | 2023-01-15 00:00:00 |
| Sunday, January 15, 2023 | Full date with weekday | 2023-01-15 00:00:00 |
+--------------------------+---------------------------------+---------------------+

We can see that the parsed_date column presents all dates as valid dates using the standard ISO 8601 date format.

Combining strptime() With Other Date Functions

Once we’ve parsed the date/time values with strptime(), we can use the resulting timestamp value to perform further operations. For example, we can use other date functions to extract date parts from the timestamp that’s returned by the strptime() function:

WITH parsed_dates AS (
    SELECT 
        date_string,
        strptime(date_string, [
            '%Y-%m-%d',
            '%m/%d/%Y',
            '%d/%m/%Y',
            '%d-%b-%Y',
            '%b %d, %Y',
            '%Y.%m.%d',
            '%Y%m%d',
            '%A, %B %d, %Y'
        ]) AS parsed_date
    FROM date_examples
)
SELECT 
    date_string,
    parsed_date,
    EXTRACT(YEAR FROM parsed_date) AS year,
    EXTRACT(MONTH FROM parsed_date) AS month,
    EXTRACT(DAY FROM parsed_date) AS day,
    DAYNAME(parsed_date) AS day_of_week
FROM parsed_dates;

Result:

+--------------------------+---------------------+------+-------+-----+-------------+
| date_string | parsed_date | year | month | day | day_of_week |
+--------------------------+---------------------+------+-------+-----+-------------+
| 2023-01-15 | 2023-01-15 00:00:00 | 2023 | 1 | 15 | Sunday |
| 01/15/2023 | 2023-01-15 00:00:00 | 2023 | 1 | 15 | Sunday |
| 15/01/2023 | 2023-01-15 00:00:00 | 2023 | 1 | 15 | Sunday |
| 15-Jan-2023 | 2023-01-15 00:00:00 | 2023 | 1 | 15 | Sunday |
| Jan 15, 2023 | 2023-01-15 00:00:00 | 2023 | 1 | 15 | Sunday |
| 2023.01.15 | 2023-01-15 00:00:00 | 2023 | 1 | 15 | Sunday |
| 20230115 | 2023-01-15 00:00:00 | 2023 | 1 | 15 | Sunday |
| Sunday, January 15, 2023 | 2023-01-15 00:00:00 | 2023 | 1 | 15 | Sunday |
+--------------------------+---------------------+------+-------+-----+-------------+

Error Handling

If none of the formats succeed, then an error is returned:

SELECT strptime( '15/07/2023', [ '%m/%d/%Y', '%A, %B %d, %Y' ] );

Output:

Invalid Input Error: Could not parse string "15/07/2023" according to format specifier "%m/%d/%Y"
15/07/2023
^
Error: Expected a full day name (Monday, Tuesday, etc...)

If you’d prefer such cases not to return an error, use the try_strptime() function instead. This function returns NULL in such cases:

.nullvalue 'null'
SELECT try_strptime( '15/07/2023', [ '%m/%d/%Y', '%A, %B %d, %Y' ] );

Output:

null