In PostgreSQL, we can use the substring()
function to extract the first substring matching the specified POSIX or SQL regular expression.
POSIX Regular Expressions
When used with POSIX regular expressions, the syntax of substring()
goes like this:
substring ( string text FROM pattern text )
Here’s an example that uses a POSIX regular expression:
SELECT substring('Restaurant' FROM 'R..t');
Result:
Rest
In POSIX, the dot (.
) matches any character. We had two dots which matched e
and s
, and so the function returned the substring Rest
because that’s the first instance in our string that matches this particular POSIX regular expression.
Here are some more POSIX examples:
\x
SELECT
substring('Restaurant' FROM 'Rest') AS "Rest",
substring('Restaurant' FROM 'R.t') AS "R.t",
substring('Restaurant' FROM 'R*t') AS "R*t",
substring('Restaurant' FROM 'r..t') AS "r..t",
substring('Restaurant' FROM '^R') AS "^R",
substring('Restaurant' FROM '^r') AS "^r",
substring('Restaurant' FROM 'beer|sta') AS "beer|sta";
Result (using vertical output):
Rest | Rest
R.t | null
R*t | t
r..t | rant
^R | R
^r | null
beer|sta | sta
SQL Regular Expressions
When used with SQL regular expressions, the syntax goes like this:
substring ( string text SIMILAR pattern text ESCAPE escape text )
The above form was specified in SQL:2003.
The following form was specified in SQL:1999 and is considered obsolete:
substring ( string text FROM pattern text FOR escape text )
So we should focus only on the first syntax.
Here’s an example that uses a SQL regular expression:
SELECT substring('Restaurant' SIMILAR '%#"R__t#"%' ESCAPE '#');
Result (using vertical output):
Rest
To indicate the part of the pattern for which the matching data sub-string is of interest, the pattern should contain two occurrences of the escape character followed by a double quote ("
). The text matching the portion of the pattern between these separators is returned when the match is successful.
Here are some more examples:
SELECT
substring('Restaurant' SIMILAR 'R%t' ESCAPE '#') AS "R%t",
substring('Restaurant' SIMILAR 's%a' ESCAPE '#') AS "s%a",
substring('Restaurant' SIMILAR '%#"s_a#"%' ESCAPE '#') AS "%#""s_a#""%",
substring('Restaurant' SIMILAR '%#"s%a#"%' ESCAPE '#') AS "%#""s%a#""%",
substring('Restaurant' SIMILAR '%(sta|xyz)%' ESCAPE '#') AS "%(sta|xyz)%",
substring('Restaurant' SIMILAR '%#"(sta|xyz)#"%' ESCAPE '#') AS "%#""(sta|xyz)#""%";
Result:
R%t | Restaurant
s%a | null
%#"s_a#"% | sta
%#"s%a#"% | staura
%(sta|xyz)% | Restaurant
%#"(sta|xyz)#"% | sta