In MariaDB, the REGEXP
operator is used to determine whether or not a string matches a regular expression.
If the string matches the regular expression provided, the result is 1
, otherwise it’s 0
.
Syntax
The syntax goes like this:
expr REGEXP pat
Where expr
is the input string and pat
is the regular expression for which you’re testing the string against.
Examples
Below are examples of using the REGEXP
operator with different patterns.
Basic Regular Expression
Let’s start with the simplest example. The simplest regular expression we can use is one that has no special characters in it. Here, we just use a string. If any part of the input string matches that string, it returns a match.
SELECT
'Corn' REGEXP 'Corn' AS "Corn",
'Acorn' REGEXP 'Corn' AS "Acorn",
'Corner' REGEXP 'Corn' AS "Corner",
'Cheese' REGEXP 'Corn' AS "Cheese";
Result:
+------+-------+--------+--------+ | Corn | Acorn | Corner | Cheese | +------+-------+--------+--------+ | 1 | 1 | 1 | 0 | +------+-------+--------+--------+
Match the Beginning of a String
In this example, the regular expression specifies that the string must begin with Co
.
SELECT
'Corn' REGEXP '^Co' AS "Corn",
'Acorn' REGEXP '^Co' AS "Acorn",
'Cheese' REGEXP '^Co' AS "Cheese";
Result:
+------+-------+--------+ | Corn | Acorn | Cheese | +------+-------+--------+ | 1 | 0 | 0 | +------+-------+--------+
Match the End of a String
In this example, the regular expression specifies that the string must end with rn
.
SELECT
'Corn' REGEXP 'rn$' AS "Corn",
'Acorn' REGEXP 'rn$' AS "Acorn",
'Cheese' REGEXP 'rn$' AS "Cheese";
Result:
+------+-------+--------+ | Corn | Acorn | Cheese | +------+-------+--------+ | 1 | 1 | 0 | +------+-------+--------+
Match Any Character
The .
character matches any character.
SELECT
'Corn' REGEXP '.' AS "Corn",
'Cheese' REGEXP '.' AS "Cheese",
'' REGEXP '.' AS "";
Result:
+------+--------+---+ | Corn | Cheese | | +------+--------+---+ | 1 | 1 | 0 | +------+--------+---+
This character is typically used in conjunction with other characters to specify further criteria. For example:
SELECT
'Corn' REGEXP '^C.rn$' AS "Corn",
'Crn' REGEXP '^C.rn$' AS "Crn";
Result:
+------+-----+ | Corn | Crn | +------+-----+ | 1 | 0 | +------+-----+
Here we specify that the string must start with C
, that it must be followed by a character (any character), and that it must end with rn
.
Note that this character specifies a single instance of the character. If you want to specify multiple instances (for example ee
instead of just e
), you’ll need to add more .
characters.
SELECT
'Tweet' REGEXP '^Tw..t$' AS "Tweet",
'Tweat' REGEXP '^Tw..t$' AS "Tweat",
'Tweet' REGEXP '^Tw.t$' AS "Tweet",
'Twit' REGEXP '^Tw..t$' AS "Twit";
Result:
+-------+-------+-------+------+ | Tweet | Tweat | Tweet | Twit | +-------+-------+-------+------+ | 1 | 1 | 0 | 0 | +-------+-------+-------+------+
Another way to do this is to specify the number of occurrences within curly brackets:
SELECT
'Tweet' REGEXP '^Tw.{2}t$' AS "Tweet",
'Tweat' REGEXP '^Tw.{2}t$' AS "Tweat",
'Tweet' REGEXP '^Tw.{1}t$' AS "Tweet",
'Twit' REGEXP '^Tw.{2}t$' AS "Twit";
Result:
+-------+-------+-------+------+ | Tweet | Tweat | Tweet | Twit | +-------+-------+-------+------+ | 1 | 1 | 0 | 0 | +-------+-------+-------+------+
However, if you know the character that you’re looking for, you can specify that character (instead of the .
character), as demonstrated in the following example.
Match Zero or More Instances of a Specific Character
We can do the following to specify zero or more instances of the e
character:
SELECT
'Twet' REGEXP '^Twe*t$' AS "Twet",
'Tweet' REGEXP '^Twe*t$' AS "Tweet",
'Tweeet' REGEXP '^Twe*t$' AS "Tweeet",
'Twt' REGEXP '^Twe*t$' AS "Twt",
'Twit' REGEXP '^Twe*t$' AS "Twit",
'Twiet' REGEXP '^Twe*t$' AS "Twiet",
'Tweit' REGEXP '^Twe*t$' AS "Tweit";
Result:
+------+-------+--------+-----+------+-------+-------+ | Twet | Tweet | Tweeet | Twt | Twit | Twiet | Tweit | +------+-------+--------+-----+------+-------+-------+ | 1 | 1 | 1 | 1 | 0 | 0 | 0 | +------+-------+--------+-----+------+-------+-------+
The first four match but the last three don’t.
Match One or More Instances of a Specific Character
We can modify the previous example so that we only get a match if one or more characters is found (the previous example returned a match if zero or more were found). To do this, we simply use +
instead of *
:
SELECT
'Twet' REGEXP '^Twe+t$' AS "Twet",
'Tweet' REGEXP '^Twe+t$' AS "Tweet",
'Tweeet' REGEXP '^Twe+t$' AS "Tweeet",
'Twt' REGEXP '^Twe+t$' AS "Twt",
'Twit' REGEXP '^Twe+t$' AS "Twit",
'Twiet' REGEXP '^Twe+t$' AS "Twiet",
'Tweit' REGEXP '^Twe+t$' AS "Tweit";
Result:
+------+-------+--------+-----+------+-------+-------+ | Twet | Tweet | Tweeet | Twt | Twit | Twiet | Tweit | +------+-------+--------+-----+------+-------+-------+ | 1 | 1 | 1 | 0 | 0 | 0 | 0 | +------+-------+--------+-----+------+-------+-------+
In this case, the fourth word returns a different result to the previous example.
Match Zero or One Instance of a Specific Character
We can modify the previous example so that we only get a match on zero or one of the desired characters. To do this, we use ?
:
SELECT
'Twet' REGEXP '^Twe?t$' AS "Twet",
'Tweet' REGEXP '^Twe?t$' AS "Tweet",
'Tweeet' REGEXP '^Twe?t$' AS "Tweeet",
'Twt' REGEXP '^Twe?t$' AS "Twt",
'Twit' REGEXP '^Twe?t$' AS "Twit",
'Twiet' REGEXP '^Twe?t$' AS "Twiet",
'Tweit' REGEXP '^Twe?t$' AS "Tweit";
Result:
+------+-------+--------+-----+------+-------+-------+ | Twet | Tweet | Tweeet | Twt | Twit | Twiet | Tweit | +------+-------+--------+-----+------+-------+-------+ | 1 | 0 | 0 | 1 | 0 | 0 | 0 | +------+-------+--------+-----+------+-------+-------+
Alternation
You can use the |
character to match one or another sequence of characters:
SELECT
'Tweet' REGEXP 'Tw|et' AS "Tweet",
'For Let' REGEXP 'Tw|et' AS "For Let",
'Banana' REGEXP 'Tw|et' AS "Banana";
Result:
+-------+---------+--------+ | Tweet | For Let | Banana | +-------+---------+--------+ | 1 | 1 | 0 | +-------+---------+--------+
Here’s another example where I search for whole words:
SELECT
'Cat' REGEXP 'Cat|Dog' AS "Cat",
'Dog' REGEXP 'Cat|Dog' AS "Dog",
'Doggone' REGEXP 'Cat|Dog' AS "Doggone",
'Banana' REGEXP 'Cat|Dog' AS "Banana";
Result:
+-----+-----+---------+--------+ | Cat | Dog | Doggone | Banana | +-----+-----+---------+--------+ | 1 | 1 | 1 | 0 | +-----+-----+---------+--------+
We still get a match even when our regular expression matches only part of the the string.
Match Zero or More Instances of a Sequence
You can use brackets along with the asterisk ()*
to specify zero or more instances of a sequence:
SELECT
'Banana' REGEXP '(an)*' AS "Banana",
'Land' REGEXP '(an)*' AS "Land",
'Cheese' REGEXP '(an)*' AS "Cheese";
Result:
+--------+------+--------+ | Banana | Land | Cheese | +--------+------+--------+ | 1 | 1 | 1 | +--------+------+--------+
Another example:
SELECT
'Banana' REGEXP '^B(an)*d$' AS "Banana",
'Band' REGEXP '^B(an)*d$' AS "Band",
'Bald' REGEXP '^B(an)*d$' AS "Bald",
'Bad' REGEXP '^B(an)*d$' AS "Bad";
Result:
+--------+------+------+-----+ | Banana | Band | Bald | Bad | +--------+------+------+-----+ | 0 | 1 | 0 | 0 | +--------+------+------+-----+
Repetition
As seen in a previous example, you can use curly brackets to specify repetition. This notation provides a more general way of writing regular expressions than some of the previous examples:
SELECT
'Tweeet' REGEXP 'e{3}' AS "Tweeet",
'Tweet' REGEXP 'e{3}' AS "Tweet";
Result:
+--------+-------+ | Tweeet | Tweet | +--------+-------+ | 1 | 0 | +--------+-------+
Range
You can use the hyphen character to specify a range. Here’s an example that specifies a range of numbers:
SELECT
'Tweet 123' REGEXP '[0-9]' AS "Tweet 123",
'Tweet ABC' REGEXP '[0-9]' AS "Tweet ABC";
Result:
+--------+-------+ | Tweeet | Tweet | +--------+-------+ | 1 | 0 | +--------+-------+
And the following example specifies a range of letters:
SELECT
'Tweet 123' REGEXP '[A-Z]' AS "Tweet 123",
'ABC' REGEXP '[A-Z]' AS "ABC",
'123' REGEXP '[A-Z]' AS "123";
Result:
+-----------+-----+-----+ | Tweet 123 | ABC | 123 | +-----------+-----+-----+ | 1 | 1 | 0 | +-----------+-----+-----+
Here’s what happens if we limit the range of numbers:
SELECT
'123' REGEXP '[1-3]' AS "123",
'012' REGEXP '[1-3]' AS "012",
'045' REGEXP '[1-3]' AS "045";
Result:
+-----+-----+-----+ | 123 | 012 | 045 | +-----+-----+-----+ | 1 | 1 | 0 | +-----+-----+-----+
Not Within a Range
We can use the ^
character to modify the previous example so that the specified range of characters are excluded:
SELECT
'123' REGEXP '[^1-3]' AS "123",
'012' REGEXP '[^1-3]' AS "012",
'045' REGEXP '[^1-3]' AS "045";
Result:
+-----+-----+-----+ | 123 | 012 | 045 | +-----+-----+-----+ | 0 | 1 | 1 | +-----+-----+-----+
In this case we get the opposite result to the previous example.
Null Values
If either the expression or the pattern is null
, the result is null
:
SELECT
null REGEXP 'Corn' AS "Corn",
'Acorn' REGEXP null AS "Acorn",
null REGEXP null AS "Corner";
Result:
+------+-------+--------+ | Corn | Acorn | Corner | +------+-------+--------+ | NULL | NULL | NULL | +------+-------+--------+