How REGEXP_SPLIT_TO_TABLE() Works in PostgreSQL

In PostgreSQL, the regexp_split_to_table() function splits a string using a POSIX regular expression as the delimiter, and returns the result in a table.

So we use a POSIX regular expression to specify the delimiter/s, and split the string based on that.

We pass the string as the first argument and the pattern as the second. We can also specify a flag to change the behaviour of the function.

Example

Here’s a basic example to demonstrate:

SELECT regexp_split_to_table('cat dog bird', '\s+');

Result:

 regexp_split_to_table 
-----------------------
cat
dog
bird
(3 rows)

Here, I used \s+ as the regular expression, which identifies the space character as the delimiter in the string at the first argument.

Here are some more examples:

\x
SELECT 
    regexp_split_to_table('cat dog bird', '\W+') AS "1",
    regexp_split_to_table('cat+dog+bird', '\W+') AS "2",
    regexp_split_to_table('cat + dog + bird', '\W+') AS "3",
    regexp_split_to_table('cat & dog & bird', '\W+') AS "4",
    regexp_split_to_table('cat&dog+bird fox', '[&+ ]+') AS "5";

Result:

  1   |  2   |  3   |  4   |  5   
------+------+------+------+------
cat | cat | cat | cat | cat
dog | dog | dog | dog | dog
bird | bird | bird | bird | bird
null | null | null | null | fox
(4 rows)

No Match

If there are no matches, the original string is returned in a single column on a single row:

SELECT regexp_split_to_table('cat dog bird', ',');

Result:

 regexp_split_to_table 
-----------------------
cat dog bird
(1 row)

Adding a Flag

We also have the option of adding a flag as a third argument. Such flags can be used to change the behaviour of the function. The following table contains the other flags we can use:

OptionDescription
bRest of RE is a BRE
cCase-sensitive matching (overrides operator type)
eRest of RE is an ERE
iCase-insensitive matching (overrides operator type)
mHistorical synonym for n
nNewline-sensitive matching
pPartial newline-sensitive matching
qRest of RE is a literal (“quoted”) string, all ordinary characters
sNon-newline-sensitive matching (default)
tTight syntax (default)
wInverse partial newline-sensitive matching
xExpanded syntax

Source: https://www.postgresql.org/docs/current/functions-matching.html#POSIX-METASYNTAX

Here’s an example of using the i flag to specify a case insensitive search:

SELECT regexp_split_to_table('cat AND dog OR bird', '\s(and|or)\s', 'i');

Result:

 regexp_split_to_table 
-----------------------
cat
dog
bird
(3 rows)

Here it is without the flag:

SELECT regexp_split_to_table('cat AND dog OR bird', '\s(and|or)\s');

Result:

 regexp_split_to_table 
-----------------------
cat AND dog OR bird
(1 row)

It didn’t split the string because I used lowercase characters in my regex when the string uses uppercase and the default behaviour is case sensitive. Given I didn’t use the i flag to specify it as case insensitive, there was no match.

More About POSIX Regular Expressions in PostgreSQL

See the PostgreSQL documentation for an overview of how POSIX regular expressions work and the various options available when using the regex_split_to_table() function.