Set the Character Set and Collation of a Column in MariaDB

You can specify the character set and collation at various levels in MariaDB. You can specify them at the connection level, the server level, the database level, the table level, and even the column level.

You can also specify a collation in your queries so that it overrides any collation that has been previously specified at the aforementioned levels.

To set the character set and collation at the column level, use the CREATE TABLE statement or ALTER TABLE statement (depending on whether you’re creating the table or modifying it).

When Creating the Table

Here’s an example of setting the collation and character set on a column when creating the table:

CREATE TABLE Events ( 
    EventId INT AUTO_INCREMENT NOT NULL, 
    EventName VARCHAR(255)
        CHARACTER SET utf8
        COLLATE utf8_spanish_ci, 
    PRIMARY KEY (EventId)
    );

Here, I specified the character set and collation at the column level. It’s also possible to set them at the table level.

Alter an Existing Column

Here’s an example of changing the collation and character set on an existing column:

ALTER TABLE Events 
MODIFY EventName VARCHAR(255) 
CHARACTER SET latin5 COLLATE latin5_turkish_ci;

The EventName column now uses the new collation and character set.

Check the Column

There are several ways to check the collation and character set of a column. Here’s one:

SELECT
    column_name, 
    character_set_name, 
    collation_name 
FROM information_schema.columns
WHERE table_name = 'Events'
AND column_name = 'EventName';

Result:

+-------------+--------------------+-------------------+
| column_name | character_set_name | collation_name    |
+-------------+--------------------+-------------------+
| EventName   | latin5             | latin5_turkish_ci |
+-------------+--------------------+-------------------+

Collation Must be Valid for the Character Set

When setting both the collation and the character set, you need to ensure that the collation is valid for the chosen character set and vice/versa. If not, an error is returned.

Here’s an example of specifying using an invalid collation for the specified character set:

ALTER TABLE Events 
MODIFY EventName VARCHAR(255) 
CHARACTER SET utf16 COLLATE latin5_turkish_ci;

Result:

ERROR 1253 (42000): COLLATION 'latin5_turkish_ci' is not valid for CHARACTER SET 'utf16'

It tells us that our chosen collation is not valid for our chosen specified character set.

See Full List of Collations Supported by MariaDB or Full List of Character Sets Supported by MariaDB for a list of matching collations and character sets.

Specifying Only the Collation or Character Set

It’s OK to specify just the collation or just the character set (or neither):

  • If only the character set is provided, the default collation for that character set will be used.
  • If only the collation is provided, the associated character set will be used.
  • If neither collation nor character set is provided, the table default is used.

Converting to a Character Set that Requires More Bytes

It’s also possible to convert the character set at the table level. When converting the character set at the table level, there’s a chance that MariaDB may automatically convert the data type to one that’s better able to handle the new character set.

However, when we use MODIFY on the individual columns and explicitly specify the data type (like we did in the previous examples), we prevent MariaDB from converting the data type.

Converting to Binary

CONVERT TO CHARACTER SET binary will convert CHARVARCHAR and TEXT columns to BINARY, VARBINARY and BLOB respectively.

Once this has been done, such columns will no longer have a character set. Also, they will no longer be affected by future CONVERT TO CHARACTER SET statements.

Checking the Collation and Character Sets in MariaDB

There are various ways to check the collation and character sets in MariaDB, depending on the level that you’re checking (server, database, table, column, connection, etc). See Show the Collation in MariaDB for examples.