How REGEXP_SPLIT_TO_TABLE() Works in PostgreSQL

In PostgreSQL, the regexp_split_to_table() function splits a string using a POSIX regular expression as the delimiter, and returns the result in a table.

So we use a POSIX regular expression to specify the delimiter/s, and split the string based on that.

We pass the string as the first argument and the pattern as the second. We can also specify a flag to change the behaviour of the function.

Example

Here’s a basic example to demonstrate:

SELECT regexp_split_to_table('cat dog bird', '\s+');

Result:

 regexp_split_to_table 
-----------------------
 cat
 dog
 bird
(3 rows)

Here, I used \s+ as the regular expression, which identifies the space character as the delimiter in the string at the first argument.

Here are some more examples:

\x
SELECT 
    regexp_split_to_table('cat dog bird', '\W+') AS "1",
    regexp_split_to_table('cat+dog+bird', '\W+') AS "2",
    regexp_split_to_table('cat + dog + bird', '\W+') AS "3",
    regexp_split_to_table('cat & dog & bird', '\W+') AS "4",
    regexp_split_to_table('cat&dog+bird fox', '[&+ ]+') AS "5";

Result:

  1   |  2   |  3   |  4   |  5   
------+------+------+------+------
 cat  | cat  | cat  | cat  | cat
 dog  | dog  | dog  | dog  | dog
 bird | bird | bird | bird | bird
 null | null | null | null | fox
(4 rows)

No Match

If there are no matches, the original string is returned in a single column on a single row:

SELECT regexp_split_to_table('cat dog bird', ',');

Result:

 regexp_split_to_table 
-----------------------
 cat dog bird
(1 row)

Adding a Flag

We also have the option of adding a flag as a third argument. Such flags can be used to change the behaviour of the function. The following table contains the other flags we can use:

Option	Description
`b`	Rest of RE is a BRE
`c`	Case-sensitive matching (overrides operator type)
`e`	Rest of RE is an ERE
`i`	Case-insensitive matching (overrides operator type)
`m`	Historical synonym for `n`
`n`	Newline-sensitive matching
`p`	Partial newline-sensitive matching
`q`	Rest of RE is a literal (“quoted”) string, all ordinary characters
`s`	Non-newline-sensitive matching (default)
`t`	Tight syntax (default)
`w`	Inverse partial newline-sensitive matching
`x`	Expanded syntax

Source: https://www.postgresql.org/docs/current/functions-matching.html#POSIX-METASYNTAX

Here’s an example of using the i flag to specify a case insensitive search:

SELECT regexp_split_to_table('cat AND dog OR bird', '\s(and|or)\s', 'i');

Result:

 regexp_split_to_table 
-----------------------
 cat
 dog
 bird
(3 rows)

Here it is without the flag:

SELECT regexp_split_to_table('cat AND dog OR bird', '\s(and|or)\s');

Result:

 regexp_split_to_table 
-----------------------
 cat AND dog OR bird
(1 row)

It didn’t split the string because I used lowercase characters in my regex when the string uses uppercase and the default behaviour is case sensitive. Given I didn’t use the i flag to specify it as case insensitive, there was no match.

Database.Guide

How REGEXP_SPLIT_TO_TABLE() Works in PostgreSQL

Example

No Match

Adding a Flag

More About POSIX Regular Expressions in PostgreSQL