An Overview of the SHOW CHARACTER SET Statement in MySQL

In MySQL we can use the SHOW CHARACTER SET statement to return all available character sets. We can also use it to narrow the output down to just those we’re interested in.

Syntax

The syntax goes like this:

SHOW {CHARACTER SET | CHARSET}
    [LIKE 'pattern' | WHERE expr]

So we can use the optional LIKE and/or WHERE clauses to narrow down the results.

We also have the option of using SHOW CHARSET as a shortcut for SHOW CHARACTER SET.

Example

We can return all available character sets like this:

SHOW CHARACTER SET;

Result:

+----------+---------------------------------+---------------------+--------+
| Charset  | Description                     | Default collation   | Maxlen |
+----------+---------------------------------+---------------------+--------+
| armscii8 | ARMSCII-8 Armenian              | armscii8_general_ci |      1 |
| ascii    | US ASCII                        | ascii_general_ci    |      1 |
| big5     | Big5 Traditional Chinese        | big5_chinese_ci     |      2 |
| binary   | Binary pseudo charset           | binary              |      1 |
| cp1250   | Windows Central European        | cp1250_general_ci   |      1 |
| cp1251   | Windows Cyrillic                | cp1251_general_ci   |      1 |
| cp1256   | Windows Arabic                  | cp1256_general_ci   |      1 |
| cp1257   | Windows Baltic                  | cp1257_general_ci   |      1 |
| cp850    | DOS West European               | cp850_general_ci    |      1 |
| cp852    | DOS Central European            | cp852_general_ci    |      1 |
| cp866    | DOS Russian                     | cp866_general_ci    |      1 |
| cp932    | SJIS for Windows Japanese       | cp932_japanese_ci   |      2 |
| dec8     | DEC West European               | dec8_swedish_ci     |      1 |
| eucjpms  | UJIS for Windows Japanese       | eucjpms_japanese_ci |      3 |
| euckr    | EUC-KR Korean                   | euckr_korean_ci     |      2 |
| gb18030  | China National Standard GB18030 | gb18030_chinese_ci  |      4 |
| gb2312   | GB2312 Simplified Chinese       | gb2312_chinese_ci   |      2 |
| gbk      | GBK Simplified Chinese          | gbk_chinese_ci      |      2 |
| geostd8  | GEOSTD8 Georgian                | geostd8_general_ci  |      1 |
| greek    | ISO 8859-7 Greek                | greek_general_ci    |      1 |
| hebrew   | ISO 8859-8 Hebrew               | hebrew_general_ci   |      1 |
| hp8      | HP West European                | hp8_english_ci      |      1 |
| keybcs2  | DOS Kamenicky Czech-Slovak      | keybcs2_general_ci  |      1 |
| koi8r    | KOI8-R Relcom Russian           | koi8r_general_ci    |      1 |
| koi8u    | KOI8-U Ukrainian                | koi8u_general_ci    |      1 |
| latin1   | cp1252 West European            | latin1_swedish_ci   |      1 |
| latin2   | ISO 8859-2 Central European     | latin2_general_ci   |      1 |
| latin5   | ISO 8859-9 Turkish              | latin5_turkish_ci   |      1 |
| latin7   | ISO 8859-13 Baltic              | latin7_general_ci   |      1 |
| macce    | Mac Central European            | macce_general_ci    |      1 |
| macroman | Mac West European               | macroman_general_ci |      1 |
| sjis     | Shift-JIS Japanese              | sjis_japanese_ci    |      2 |
| swe7     | 7bit Swedish                    | swe7_swedish_ci     |      1 |
| tis620   | TIS620 Thai                     | tis620_thai_ci      |      1 |
| ucs2     | UCS-2 Unicode                   | ucs2_general_ci     |      2 |
| ujis     | EUC-JP Japanese                 | ujis_japanese_ci    |      3 |
| utf16    | UTF-16 Unicode                  | utf16_general_ci    |      4 |
| utf16le  | UTF-16LE Unicode                | utf16le_general_ci  |      4 |
| utf32    | UTF-32 Unicode                  | utf32_general_ci    |      4 |
| utf8mb3  | UTF-8 Unicode                   | utf8mb3_general_ci  |      3 |
| utf8mb4  | UTF-8 Unicode                   | utf8mb4_0900_ai_ci  |      4 |
+----------+---------------------------------+---------------------+--------+
41 rows in set (0.01 sec)

The LIKE Clause

Here’s an example that uses the LIKE clause to return all character sets that start with utf:

SHOW CHARACTER SET LIKE 'utf%';

Result:

+---------+------------------+--------------------+--------+
| Charset | Description      | Default collation  | Maxlen |
+---------+------------------+--------------------+--------+
| utf16   | UTF-16 Unicode   | utf16_general_ci   |      4 |
| utf16le | UTF-16LE Unicode | utf16le_general_ci |      4 |
| utf32   | UTF-32 Unicode   | utf32_general_ci   |      4 |
| utf8mb3 | UTF-8 Unicode    | utf8mb3_general_ci |      3 |
| utf8mb4 | UTF-8 Unicode    | utf8mb4_0900_ai_ci |      4 |
+---------+------------------+--------------------+--------+
5 rows in set (0.00 sec)

The WHERE Clause

Here’s an example that uses the WHERE clause to return all character sets with a Maxlen of 4:

SHOW CHARACTER SET WHERE Maxlen = '4';

Result:

+---------+---------------------------------+--------------------+--------+
| Charset | Description                     | Default collation  | Maxlen |
+---------+---------------------------------+--------------------+--------+
| gb18030 | China National Standard GB18030 | gb18030_chinese_ci |      4 |
| utf16   | UTF-16 Unicode                  | utf16_general_ci   |      4 |
| utf16le | UTF-16LE Unicode                | utf16le_general_ci |      4 |
| utf32   | UTF-32 Unicode                  | utf32_general_ci   |      4 |
| utf8mb4 | UTF-8 Unicode                   | utf8mb4_0900_ai_ci |      4 |
+---------+---------------------------------+--------------------+--------+
5 rows in set (0.00 sec)

The AND and OR Keywords

Even though it’s not explicitly written into the syntax for this statement, we can combine WHERE conditions with the AND and/or OR keywords:

SHOW CHARACTER SET 
WHERE Maxlen = '4'
OR Description LIKE '%Unicode%';

Result:

+---------+---------------------------------+--------------------+--------+
| Charset | Description                     | Default collation  | Maxlen |
+---------+---------------------------------+--------------------+--------+
| gb18030 | China National Standard GB18030 | gb18030_chinese_ci |      4 |
| ucs2    | UCS-2 Unicode                   | ucs2_general_ci    |      2 |
| utf16   | UTF-16 Unicode                  | utf16_general_ci   |      4 |
| utf16le | UTF-16LE Unicode                | utf16le_general_ci |      4 |
| utf32   | UTF-32 Unicode                  | utf32_general_ci   |      4 |
| utf8mb3 | UTF-8 Unicode                   | utf8mb3_general_ci |      3 |
| utf8mb4 | UTF-8 Unicode                   | utf8mb4_0900_ai_ci |      4 |
+---------+---------------------------------+--------------------+--------+
7 rows in set (0.00 sec)

Another way to get character set information is by querying the information_schema.character_sets table.