Below are two methods for returning rows that contain only non-alphanumeric characters in MariaDB.
Non-alphanumeric characters include punctuation characters like !@#&()–[{}]:;',?/*
and symbols like `~$^+=<>“
, as well as whitespace characters like the space or tab characters.
Sample Data
We’ll use the following data for our examples:
CREATE TABLE t1 (
c1 varchar(255) NULL
);
INSERT INTO t1 VALUES
('Music'),
('Live Music'),
('Café'),
('Café Del Mar'),
('100 Cafés'),
('[email protected]'),
('1 + 1'),
('()'),
('!@#&()–[{}]:;'',?/*'),
('`~$^+=<>“'),
('$1.50'),
('Player 456'),
('007'),
(null),
(''),
('é'),
('É'),
('é 123'),
('ø'),
('ø 123');
SELECT c1 FROM t1;
Result:
+----------------------+ | c1 | +----------------------+ | Music | | Live Music | | Café | | Café Del Mar | | 100 Cafés | | [email protected] | | 1 + 1 | | () | | !@#&()–[{}]:;',?/* | | `~$^+=<>“ | | $1.50 | | Player 456 | | 007 | | NULL | | | | é | | É | | é 123 | | ø | | ø 123 | +----------------------+
Option 1: Compare to [:alnum:]
MariaDB’s regular expression capability includes support for the POSIX character classes. Therefore, we can use the [:alnum:]
POSIX character class in our regular expressions to find the rows that contain only alphanumeric characters, then negate that with the NOT
operator.
SELECT c1 FROM t1
WHERE c1 NOT REGEXP '[[:alnum:]]';
Result:
+----------------------+ | c1 | +----------------------+ | () | | !@#&()–[{}]:;',?/* | | `~$^+=<>“ | | | +----------------------+
If you need to exclude a specific character, you can always use the NULLIF()
function.
For example, let’s exclude the empty string from the last row:
SELECT c1 FROM t1
WHERE NULLIF(c1, '') NOT REGEXP '[[:alnum:]]';
Result:
+----------------------+ | c1 | +----------------------+ | () | | !@#&()–[{}]:;',?/* | | `~$^+=<>“ | +----------------------+
Option 2: Specify a Range of Characters
Another way to do it is to specify a range of characters within your regular expression.
Example:
SELECT c1 FROM t1
WHERE c1 NOT REGEXP '[A-Za-z0-9]';
Result:
+----------------------+ | c1 | +----------------------+ | () | | !@#&()–[{}]:;',?/* | | `~$^+=<>“ | | | | é | | É | | ø | +----------------------+
And to remove the empty string:
SELECT c1 FROM t1
WHERE NULLIF(c1, '') NOT REGEXP '[A-Za-z0-9]';
Result:
+----------------------+ | c1 | +----------------------+ | () | | !@#&()–[{}]:;',?/* | | `~$^+=<>“ | | é | | É | | ø | +----------------------+
In this case, my exclusion range didn’t cover alphanumeric characters like é
, É
, and ø
, and so the output is not a true representation of non-alphanumeric characters. That said, at least this method gives you the option to specify the exact characters you want to include or exclude from the result.