You can specify the character set and collation at various levels in MariaDB. You can specify them at the connection level, the server level, the database level, the table level, and even the column level.
You can also specify a collation in your queries so that it overrides any collation that has been previously specified at the aforementioned levels.
To set the character set and collation at the column level, use the CREATE TABLE
statement or ALTER TABLE
statement (depending on whether you’re creating the table or modifying it).
When Creating the Table
Here’s an example of setting the collation and character set on a column when creating the table:
CREATE TABLE Events (
EventId INT AUTO_INCREMENT NOT NULL,
EventName VARCHAR(255)
CHARACTER SET utf8
COLLATE utf8_spanish_ci,
PRIMARY KEY (EventId)
);
Here, I specified the character set and collation at the column level. It’s also possible to set them at the table level.
Alter an Existing Column
Here’s an example of changing the collation and character set on an existing column:
ALTER TABLE Events
MODIFY EventName VARCHAR(255)
CHARACTER SET latin5 COLLATE latin5_turkish_ci;
The EventName
column now uses the new collation and character set.
Check the Column
There are several ways to check the collation and character set of a column. Here’s one:
SELECT
column_name,
character_set_name,
collation_name
FROM information_schema.columns
WHERE table_name = 'Events'
AND column_name = 'EventName';
Result:
+-------------+--------------------+-------------------+ | column_name | character_set_name | collation_name | +-------------+--------------------+-------------------+ | EventName | latin5 | latin5_turkish_ci | +-------------+--------------------+-------------------+
Collation Must be Valid for the Character Set
When setting both the collation and the character set, you need to ensure that the collation is valid for the chosen character set and vice/versa. If not, an error is returned.
Here’s an example of specifying using an invalid collation for the specified character set:
ALTER TABLE Events
MODIFY EventName VARCHAR(255)
CHARACTER SET utf16 COLLATE latin5_turkish_ci;
Result:
ERROR 1253 (42000): COLLATION 'latin5_turkish_ci' is not valid for CHARACTER SET 'utf16'
It tells us that our chosen collation is not valid for our chosen specified character set.
See Full List of Collations Supported by MariaDB or Full List of Character Sets Supported by MariaDB for a list of matching collations and character sets.
Specifying Only the Collation or Character Set
It’s OK to specify just the collation or just the character set (or neither):
- If only the character set is provided, the default collation for that character set will be used.
- If only the collation is provided, the associated character set will be used.
- If neither collation nor character set is provided, the table default is used.
Converting to a Character Set that Requires More Bytes
It’s also possible to convert the character set at the table level. When converting the character set at the table level, there’s a chance that MariaDB may automatically convert the data type to one that’s better able to handle the new character set.
However, when we use MODIFY
on the individual columns and explicitly specify the data type (like we did in the previous examples), we prevent MariaDB from converting the data type.
Converting to Binary
CONVERT TO CHARACTER SET binary
will convert CHAR
, VARCHAR
and TEXT
columns to BINARY
, VARBINARY
and BLOB
respectively.
Once this has been done, such columns will no longer have a character set. Also, they will no longer be affected by future CONVERT TO CHARACTER SET
statements.
Checking the Collation and Character Sets in MariaDB
There are various ways to check the collation and character sets in MariaDB, depending on the level that you’re checking (server, database, table, column, connection, etc). See Show the Collation in MariaDB for examples.