Set the Character Set and Collation of a Table in MariaDB

You can specify the character set and collation at various levels in MariaDB. You can specify them at the connection level, the server level, the database level, the table level, and even the column level.

You can also specify a collation in your queries so that it overrides any collation that has been previously specified at the aforementioned levels.

To set the character set and collation at the table level, use the CREATE TABLE statement or ALTER TABLE statement (depending on whether you’re creating the table or modifying it).

When Creating the Table

Here’s an example of setting the collation and character set when creating the table:

CREATE TABLE Products ( 
    ProductId CHAR(10), 
    ProductName VARCHAR(255) NOT NULL, 
    PRIMARY KEY (ProductId)) 
    CHARACTER SET latin1 
    COLLATE latin1_general_ci;

That creates a table called Products with a latin1 character set, and a collation of latin1_general_ci.

Alter an Existing Table

Here’s an example of changing the collation and character set on an existing table:

ALTER TABLE Products 
CONVERT TO CHARACTER SET utf8 COLLATE utf8_spanish_ci;

It now uses the new collation and character set.

It’s also possible to omit the CONVERT TO part:

ALTER TABLE Products 
CHARACTER SET utf8 
COLLATE utf8_spanish_ci;

Collation Must be Valid for the Character Set

When setting both the collation and the character set, you need to ensure that the collation is valid for the chosen character set and vice/versa. If not, an error is returned.

Here’s an example of specifying using an invalid collation for the specified character set:

ALTER TABLE Products 
CHARACTER SET latin1 
COLLATE utf8_spanish_ci;

Result:

ERROR 1253 (42000): COLLATION 'utf8_spanish_ci' is not valid for CHARACTER SET 'latin1'

It tells us that our chosen collation is not valid for our chosen specified character set.

See Full List of Collations Supported by MariaDB or Full List of Character Sets Supported by MariaDB for a list of matching collations and character sets.

Specifying Only the Collation or Character Set

It’s OK to specify just the collation or just the character set (or neither):

  • If only the character set is provided, the default collation for that character set will be used.
  • If only the collation is provided, the associated character set will be used.
  • If neither collation nor character set is provided, the table default is used.

Converting to a Character Set that Requires More Bytes

For VARCHAR or TEXT columns, CONVERT TO CHARACTER SET changes the data type if required to ensure that the new column is long enough to store as many characters as the original column.

For example, an ascii TEXT column requires a single byte per character, so the column can hold up to 65,535 characters. If the column is converted to utf8, up to 3 bytes can be required for each character, so the column will be converted to MEDIUMTEXT (which has a maximum length of 16,777,215) so that it will be able to hold the same number of characters.

If you don’t want such data type changes to take place, use MODIFY on the individual columns instead.

For example:

ALTER TABLE Products 
MODIFY ProductName VARCHAR(255) 
CHARACTER SET utf8; 

Converting to Binary

CONVERT TO CHARACTER SET binary will convert CHARVARCHAR and TEXT columns to BINARY, VARBINARY and BLOB respectively.

Once this has been done, such columns will no longer have a character set. Also, they will no longer be affected by future CONVERT TO CHARACTER SET statements.

Checking the Collation and Character Sets in MariaDB

There are various ways to check the collation and character sets in MariaDB, depending on the level that you’re checking (server, database, table, column, connection, etc). See Show the Collation in MariaDB for examples.