In MariaDB, the REGEXP_REPLACE()
function replaces occurrences of the substring within a string that matches the given regular expression pattern.
The whole string is returned along with the replacements.
If there’s no match (i.e. the input string doesn’t contain the substring), the the whole string is returned unchanged.
Syntax
The syntax goes like this:
REGEXP_REPLACE(subject, pattern, replace)
Where subject
is the input string, pattern
is the regular expression pattern for the substring, and replace
is the replacement string.
Note that, at the time of writing, MariaDB’s version of REGEXP_REPLACE()
accepts fewer arguments than MySQL’s REGEXP_REPLACE()
function. MySQL’s version allows you to provide arguments for the starting position of the search, which occurrence to search for, as well as a way to refine the regular expression with match type.
Example
Here’s a basic example of using REGEXP_REPLACE()
in MariaDB:
SELECT REGEXP_REPLACE('Cats and dogs', 'd.g', 'bird');
Result:
+------------------------------------------------+ | REGEXP_REPLACE('Cats and dogs', 'd.g', 'bird') | +------------------------------------------------+ | Cats and birds | +------------------------------------------------+
In this case there’s a match, and the substring is replaced with the replacement string.
Regular expressions can be very powerful, and this example uses a very simple example. In order to use REGEXP_REPLACE()
effectively, you’ll need to know the correct pattern to use for the desired outcome.
Multiple Matches
Here’s an example with multiple matches:
SELECT REGEXP_REPLACE('My dog likes other dogs', 'd.g', 'bird');
Result:
+----------------------------------------------------------+ | REGEXP_REPLACE('My dog likes other dogs', 'd.g', 'bird') | +----------------------------------------------------------+ | My bird likes other birds | +----------------------------------------------------------+
No Match
Here’s an example where there’s no match:
SELECT REGEXP_REPLACE('Cats and dogs', 't.g', 'bird');
Result:
+------------------------------------------------+ | REGEXP_REPLACE('Cats and dogs', 't.g', 'bird') | +------------------------------------------------+ | Cats and dogs | +------------------------------------------------+
There’s no match, so the original string is returned unchanged.
Case Sensitivity
The REGEXP_REPLACE()
function follows the case sensitivity rules of the effective collation. Matching is performed case insensitively for case insensitive collations, and case sensitively for case sensitive collations and for binary data.
Here’s an example:
SELECT
REGEXP_REPLACE('My Cats', 'c.t', 'dog') AS "My Default",
REGEXP_REPLACE(_latin7'My Cats' COLLATE latin7_general_ci, 'c.t', 'dog') AS "Case Insensitive",
REGEXP_REPLACE(_latin7'My Cats' COLLATE latin7_general_cs, 'c.t', 'dog') AS "Case Sensitive";
Result:
+------------+------------------+----------------+ | My Default | Case Insensitive | Case Sensitive | +------------+------------------+----------------+ | My dogs | My dogs | My Cats | +------------+------------------+----------------+
My default collation is case insensitive. The other two strings were forced to a case insensitive collation and case sensitive collation respectively.
Providing a BINARY
string is also case sensitive (more on this later).
Override Case Sensitivity
The collation case sensitivity can be overridden using the (?i
) and (?-i
) PCRE flags.
Here’s the previous example, but this time using the (?-i)
flag on each pattern to force case sensitivity:
SELECT
REGEXP_REPLACE('My Cats', '(?-i)c.t', 'dog') AS "My Default",
REGEXP_REPLACE(_latin7'My Cats' COLLATE latin7_general_ci, '(?-i)c.t', 'dog') AS "Case Insensitive",
REGEXP_REPLACE(_latin7'My Cats' COLLATE latin7_general_cs, '(?-i)c.t', 'dog') AS "Case Sensitive";
Result:
+------------+------------------+----------------+ | My Default | Case Insensitive | Case Sensitive | +------------+------------------+----------------+ | My Cats | My Cats | My Cats | +------------+------------------+----------------+
And here it is using the (?i)
flag to force case insensitivity:
SELECT
REGEXP_REPLACE('My Cats', '(?i)c.t', 'dog') AS "My Default",
REGEXP_REPLACE(_latin7'My Cats' COLLATE latin7_general_ci, '(?i)c.t', 'dog') AS "Case Insensitive",
REGEXP_REPLACE(_latin7'My Cats' COLLATE latin7_general_cs, '(?i)c.t', 'dog') AS "Case Sensitive";
Result:
+------------+------------------+----------------+ | My Default | Case Insensitive | Case Sensitive | +------------+------------------+----------------+ | My dogs | My dogs | My dogs | +------------+------------------+----------------+
Binary Strings
Passing a BINARY
string also affects the case sensitivity. With BINARY
strings, an upper case character is different to its lower case counterpart:
Example:
SELECT
REGEXP_REPLACE('My Cats', 'c.t', 'dog') AS "Character",
REGEXP_REPLACE(BINARY 'My Cats', 'c.t', 'dog') AS "Binary";
Result:
+-----------+---------+ | Character | Binary | +-----------+---------+ | My dogs | My Cats | +-----------+---------+
Here’s what happens if we change the case:
SELECT
REGEXP_REPLACE('My Cats', 'C.t', 'dog') AS "Character",
REGEXP_REPLACE(BINARY 'My Cats', 'C.t', 'dog') AS "Binary";
Result:
+-----------+---------+ | Character | Binary | +-----------+---------+ | My dogs | My dogs | +-----------+---------+
Case sensitivity on BINARY
strings can also be overridden with the (?-i)
and (?i)
PCRE flags:
SELECT
REGEXP_REPLACE('My Cats', '(?-i)c.t', 'dog') AS "Character",
REGEXP_REPLACE(BINARY 'My Cats', '(?-i)c.t', 'dog') AS "Binary";
Result:
+-----------+---------+ | Character | Binary | +-----------+---------+ | My Cats | My Cats | +-----------+---------+
Null Arguments
Passing null
as any argument results in null
:
SELECT
REGEXP_REPLACE(null, 'c.t', 'dog') AS "1",
REGEXP_REPLACE('Cat', null, 'dog') AS "2",
REGEXP_REPLACE('Cat', 'c.t', null) AS "3",
REGEXP_REPLACE(null, null, null) AS "4";
Result:
+------+------+------+------+ | 1 | 2 | 3 | 4 | +------+------+------+------+ | NULL | NULL | NULL | NULL | +------+------+------+------+
Wrong Number of Arguments
Passing the wrong number of arguments or no arguments, results in an error:
SELECT REGEXP_REPLACE();
Result:
ERROR 1582 (42000): Incorrect parameter count in the call to native function 'REGEXP_REPLACE'