In MySQL, the REGEXP
operator is used to determine whether or not a string matches a regular expression. It’s a synonym for REGEXP_LIKE()
.
If the string matches the regular expression provided, the result is 1
, otherwise it’s 0
.
Syntax
The syntax goes like this:
expr REGEXP pat
Where expr
is the input string and pat
is the regular expression for which you’re testing the string against.
Example 1 – Basic Regular Expression
The most basic regular expression we can use is one that has no special characters in it. Here, we just use a string. If any part of the input string matches that string, it returns a match.
SELECT 'Corn' REGEXP 'Corn' AS 'Corn', 'Acorn' REGEXP 'Corn' AS 'Acorn', 'Corner' REGEXP 'Corn' AS 'Corner', 'Cheese' REGEXP 'Corn' AS 'Cheese';
Result:
+------+-------+--------+--------+ | Corn | Acorn | Corner | Cheese | +------+-------+--------+--------+ | 1 | 1 | 1 | 0 | +------+-------+--------+--------+
Example 2 – Match the Beginning of a String
In this example, the regular expression specifies that the string must begin with Co
.
SELECT 'Corn' REGEXP '^Co' AS 'Corn', 'Acorn' REGEXP '^Co' AS 'Acorn', 'Cheese' REGEXP '^Co' AS 'Cheese';
Result:
+------+-------+--------+ | Corn | Acorn | Cheese | +------+-------+--------+ | 1 | 0 | 0 | +------+-------+--------+
Example 3 – Match the End of a String
In this example, the regular expression specifies that the string must end with rn
.
SELECT 'Corn' REGEXP 'rn$' AS 'Corn', 'Acorn' REGEXP 'rn$' AS 'Acorn', 'Cheese' REGEXP 'rn$' AS 'Cheese';
Result:
+------+-------+--------+ | Corn | Acorn | Cheese | +------+-------+--------+ | 1 | 1 | 0 | +------+-------+--------+
Example 4 – Match Any Character
The .
character matches any character.
SELECT 'Corn' REGEXP '.' AS 'Corn', 'Cheese' REGEXP '.' AS 'Cheese', '' REGEXP '.' AS '';
Result:
+------+--------+---+ | Corn | Cheese | | +------+--------+---+ | 1 | 1 | 0 | +------+--------+---+
This character is typically used in conjunction with other characters to specify further criteria. For example:
SELECT 'Corn' REGEXP '^C.rn$' AS 'Corn', 'Crn' REGEXP '^C.rn$' AS 'Crn';
Result:
+------+-----+ | Corn | Crn | +------+-----+ | 1 | 0 | +------+-----+
Here we specify that the string must start with C
, that it must be followed by a character (any character), and that it must end with rn
.
Note that this character specifies a single instance of the character. If you want to specify multiple instances (for example ee
instead of just e
), you’ll need to add more .
characters.
SELECT 'Tweet' REGEXP '^Tw..t$' AS 'Tweet', 'Tweat' REGEXP '^Tw..t$' AS 'Tweat', 'Tweet' REGEXP '^Tw.t$' AS 'Tweet', 'Twit' REGEXP '^Tw..t$' AS 'Twit';
Result:
+-------+-------+-------+------+ | Tweet | Tweat | Tweet | Twit | +-------+-------+-------+------+ | 1 | 1 | 0 | 0 | +-------+-------+-------+------+
Another way to do this is to specify the number of occurrences within curly brackets:
SELECT 'Tweet' REGEXP '^Tw.{2}t$' AS 'Tweet', 'Tweat' REGEXP '^Tw.{2}t$' AS 'Tweat', 'Tweet' REGEXP '^Tw.{1}t$' AS 'Tweet', 'Twit' REGEXP '^Tw.{2}t$' AS 'Twit';
Result:
+-------+-------+-------+------+ | Tweet | Tweat | Tweet | Twit | +-------+-------+-------+------+ | 1 | 1 | 0 | 0 | +-------+-------+-------+------+
However, if you know the character that you’re looking for, you can specify that character (instead of the .
character), as demonstrated in the following example.
Example 5 – Match Zero or More Instances of a Specific Character
We can do the following to specify zero or more instances of the e
character:
SELECT 'Twet' REGEXP '^Twe*t$' AS 'Twet', 'Tweet' REGEXP '^Twe*t$' AS 'Tweet', 'Tweeet' REGEXP '^Twe*t$' AS 'Tweeet', 'Twt' REGEXP '^Twe*t$' AS 'Twt', 'Twit' REGEXP '^Twe*t$' AS 'Twit', 'Twiet' REGEXP '^Twe*t$' AS 'Twiet', 'Tweit' REGEXP '^Twe*t$' AS 'Tweit';
Result:
+------+-------+--------+-----+------+-------+-------+ | Twet | Tweet | Tweeet | Twt | Twit | Twiet | Tweit | +------+-------+--------+-----+------+-------+-------+ | 1 | 1 | 1 | 1 | 0 | 0 | 0 | +------+-------+--------+-----+------+-------+-------+
The first four match but the last three don’t.
Example 6 – Match One or More Instances of a Specific Character
We can modify the previous example so that we only get a match if one or more characters is found (the previous example returned a match if zero or more were found). To do this, we simply use +
instead of *
:
SELECT 'Twet' REGEXP '^Twe+t$' AS 'Twet', 'Tweet' REGEXP '^Twe+t$' AS 'Tweet', 'Tweeet' REGEXP '^Twe+t$' AS 'Tweeet', 'Twt' REGEXP '^Twe+t$' AS 'Twt', 'Twit' REGEXP '^Twe+t$' AS 'Twit', 'Twiet' REGEXP '^Twe+t$' AS 'Twiet', 'Tweit' REGEXP '^Twe+t$' AS 'Tweit';
Result:
+------+-------+--------+-----+------+-------+-------+ | Twet | Tweet | Tweeet | Twt | Twit | Twiet | Tweit | +------+-------+--------+-----+------+-------+-------+ | 1 | 1 | 1 | 0 | 0 | 0 | 0 | +------+-------+--------+-----+------+-------+-------+
In this case, the fourth word returns a different result to the previous example.
Example 7 – Match Zero or One Instance of a Specific Character
We can modify the previous example so that we only get a match on zero or one of the desired characters. To do this, we use ?
:
SELECT 'Twet' REGEXP '^Twe?t$' AS 'Twet', 'Tweet' REGEXP '^Twe?t$' AS 'Tweet', 'Tweeet' REGEXP '^Twe?t$' AS 'Tweeet', 'Twt' REGEXP '^Twe?t$' AS 'Twt', 'Twit' REGEXP '^Twe?t$' AS 'Twit', 'Twiet' REGEXP '^Twe?t$' AS 'Twiet', 'Tweit' REGEXP '^Twe?t$' AS 'Tweit';
Result:
+------+-------+--------+-----+------+-------+-------+ | Twet | Tweet | Tweeet | Twt | Twit | Twiet | Tweit | +------+-------+--------+-----+------+-------+-------+ | 1 | 0 | 0 | 1 | 0 | 0 | 0 | +------+-------+--------+-----+------+-------+-------+
Example 8 – Alternation
You can use the |
character to match one or another sequence of characters:
SELECT 'Tweet' REGEXP 'Tw|et' AS 'Tweet', 'For Let' REGEXP 'Tw|et' AS 'For Let', 'Banana' REGEXP 'Tw|et' AS 'Banana';
Result:
+-------+---------+--------+ | Tweet | For Let | Banana | +-------+---------+--------+ | 1 | 1 | 0 | +-------+---------+--------+
Here’s another example where I search for whole words:
SELECT 'Cat' REGEXP 'Cat|Dog' AS 'Cat', 'Dog' REGEXP 'Cat|Dog' AS 'Dog', 'Doggone' REGEXP 'Cat|Dog' AS 'Doggone', 'Banana' REGEXP 'Cat|Dog' AS 'Banana';
Result:
+-----+-----+---------+--------+ | Cat | Dog | Doggone | Banana | +-----+-----+---------+--------+ | 1 | 1 | 1 | 0 | +-----+-----+---------+--------+
We still get a match even when our regular expression matches only part of the the string.
Example 9 – Match Zero or More Instances of a Sequence
You can use brackets along with the asterisk ()*
to specify zero or more instances of a sequence:
SELECT 'Banana' REGEXP '(an)*' AS 'Banana', 'Land' REGEXP '(an)*' AS 'Land', 'Cheese' REGEXP '(an)*' AS 'Cheese';
Result:
+--------+------+--------+ | Banana | Land | Cheese | +--------+------+--------+ | 1 | 1 | 1 | +--------+------+--------+
Another example:
SELECT 'Banana' REGEXP '^B(an)*d$' AS 'Banana', 'Band' REGEXP '^B(an)*d$' AS 'Band', 'Bald' REGEXP '^B(an)*d$' AS 'Bald', 'Bad' REGEXP '^B(an)*d$' AS 'Bad';
Result:
+--------+------+------+-----+ | Banana | Band | Bald | Bad | +--------+------+------+-----+ | 0 | 1 | 0 | 0 | +--------+------+------+-----+
Example 10 – Repetition
As seen in a previous example, you can use curly brackets to specify repetition. This notation provides a more general way of writing regular expressions than some of the previous examples:
SELECT 'Tweeet' REGEXP 'e{3}' AS 'Tweeet', 'Tweet' REGEXP 'e{3}' AS 'Tweet';
Result:
+--------+-------+ | Tweeet | Tweet | +--------+-------+ | 1 | 0 | +--------+-------+
Example 11 – Range
You can use the hyphen character to specify a range. Here’s an example that specifies a range of numbers:
SELECT 'Tweet 123' REGEXP '[0-9]' AS 'Tweet 123', 'Tweet ABC' REGEXP '[0-9]' AS 'Tweet ABC';
Result:
+--------+-------+ | Tweeet | Tweet | +--------+-------+ | 1 | 0 | +--------+-------+
And the following example specifies a range of letters:
SELECT 'Tweet 123' REGEXP '[A-Z]' AS 'Tweet 123', 'ABC' REGEXP '[A-Z]' AS 'ABC', '123' REGEXP '[A-Z]' AS '123';
Result:
+--------+-------+ | Tweeet | Tweet | +--------+-------+ | 1 | 0 | +--------+-------+
Here’s what happens if we limit the range of numbers:
SELECT '123' REGEXP '[1-3]' AS '123', '012' REGEXP '[1-3]' AS '012', '045' REGEXP '[1-3]' AS '045';
Result:
+-----+-----+-----+ | 123 | 012 | 045 | +-----+-----+-----+ | 1 | 1 | 0 | +-----+-----+-----+
Example 12 – Not Within a Range
We can use the ^
character to modify the previous example so that the specified range of characters are excluded:
SELECT '123' REGEXP '[^1-3]' AS '123', '012' REGEXP '[^1-3]' AS '012', '045' REGEXP '[^1-3]' AS '045';
Result:
+-----+-----+-----+ | 123 | 012 | 045 | +-----+-----+-----+ | 0 | 1 | 1 | +-----+-----+-----+
So we get the opposite result to the previous example.