In PostgreSQL, the regexp_split_to_table()
function splits a string using a POSIX regular expression as the delimiter, and returns the result in a table.
So we use a POSIX regular expression to specify the delimiter/s, and split the string based on that.
We pass the string as the first argument and the pattern as the second. We can also specify a flag to change the behaviour of the function.
Example
Here’s a basic example to demonstrate:
SELECT regexp_split_to_table('cat dog bird', '\s+');
Result:
regexp_split_to_table
-----------------------
cat
dog
bird
(3 rows)
Here, I used \s+
as the regular expression, which identifies the space character as the delimiter in the string at the first argument.
Here are some more examples:
\x
SELECT
regexp_split_to_table('cat dog bird', '\W+') AS "1",
regexp_split_to_table('cat+dog+bird', '\W+') AS "2",
regexp_split_to_table('cat + dog + bird', '\W+') AS "3",
regexp_split_to_table('cat & dog & bird', '\W+') AS "4",
regexp_split_to_table('cat&dog+bird fox', '[&+ ]+') AS "5";
Result:
1 | 2 | 3 | 4 | 5
------+------+------+------+------
cat | cat | cat | cat | cat
dog | dog | dog | dog | dog
bird | bird | bird | bird | bird
null | null | null | null | fox
(4 rows)
No Match
If there are no matches, the original string is returned in a single column on a single row:
SELECT regexp_split_to_table('cat dog bird', ',');
Result:
regexp_split_to_table
-----------------------
cat dog bird
(1 row)
Adding a Flag
We also have the option of adding a flag as a third argument. Such flags can be used to change the behaviour of the function. The following table contains the other flags we can use:
Option | Description |
---|---|
b | Rest of RE is a BRE |
c | Case-sensitive matching (overrides operator type) |
e | Rest of RE is an ERE |
i | Case-insensitive matching (overrides operator type) |
m | Historical synonym for n |
n | Newline-sensitive matching |
p | Partial newline-sensitive matching |
q | Rest of RE is a literal (“quoted”) string, all ordinary characters |
s | Non-newline-sensitive matching (default) |
t | Tight syntax (default) |
w | Inverse partial newline-sensitive matching |
x | Expanded syntax |
Source: https://www.postgresql.org/docs/current/functions-matching.html#POSIX-METASYNTAX
Here’s an example of using the i
flag to specify a case insensitive search:
SELECT regexp_split_to_table('cat AND dog OR bird', '\s(and|or)\s', 'i');
Result:
regexp_split_to_table
-----------------------
cat
dog
bird
(3 rows)
Here it is without the flag:
SELECT regexp_split_to_table('cat AND dog OR bird', '\s(and|or)\s');
Result:
regexp_split_to_table
-----------------------
cat AND dog OR bird
(1 row)
It didn’t split the string because I used lowercase characters in my regex when the string uses uppercase and the default behaviour is case sensitive. Given I didn’t use the i
flag to specify it as case insensitive, there was no match.
More About POSIX Regular Expressions in PostgreSQL
See the PostgreSQL documentation for an overview of how POSIX regular expressions work and the various options available when using the regex_split_to_table()
function.