MySQL length() vs char_length()

In MySQL, there are many times where the length() function and the char_length() function will provide exactly the same results. However, there are also times where the results will be completely different. Here’s why.

First, here’s the definition for each of these functions:

char_length()
Returns the length of a string, measured in characters.
length()
Returns the length of a string, measured in bytes.

Notice “characters” vs “bytes” – one is measured in characters, the other is measured in bytes.

In many cases, the number of bytes will be the same as the number of characters in the string, but this isn’t always the case. The number of bytes used per character depends on how the data is stored. For example, if the string is stored as Unicode data, there will be 2 bytes per character.

Here’s a basic example using ASCII text (where both functions return the same result):

SELECT 
    CHAR_LENGTH('Lit'), 
    LENGTH('Lit');

Result:

+--------------------+---------------+
| CHAR_LENGTH('Lit') | LENGTH('Lit') |
+--------------------+---------------+
|                  3 |             3 |
+--------------------+---------------+

And often we will get the same result if the string is stored in a database:

SELECT 
    CHAR_LENGTH(ArtistName),
    LENGTH(ArtistName)    
FROM Artists
WHERE ArtistName = 'Lit';

Result:

+-------------------------+--------------------+
| CHAR_LENGTH(ArtistName) | LENGTH(ArtistName) |
+-------------------------+--------------------+
|                       3 |                  3 |
+-------------------------+--------------------+

However, if we change the database column to store the data as unicode:

ALTER TABLE Artists 
MODIFY COLUMN ArtistName VARCHAR(255) unicode;

And then run the same query again:

SELECT 
    CHAR_LENGTH(ArtistName),
    LENGTH(ArtistName)    
FROM Artists
WHERE ArtistName = 'Lit';

We get a different result:

+-------------------------+--------------------+
| CHAR_LENGTH(ArtistName) | LENGTH(ArtistName) |
+-------------------------+--------------------+
|                       3 |                  6 |
+-------------------------+--------------------+

This is because unicode stores each character as 2 bytes.

This is similar to the difference between datalength() and len() in T-SQL.

MySQL also has the octet_length() function, which is a synonym for length().