In PostgreSQL, we can use the to_ascii()
function to convert a string to ASCII from another encoding. This usually involves dropping any accents (diacritic signs) from the original string.
We pass the string as an argument to the function. We also have the option of specifying the encoding (either its name or its number).
Example
Here’s an example to demonstrate:
SELECT to_ascii('Café');
Result:
Cafe
That was run in a database that uses Latin1 encoding.
This can be useful if we want to do comparisons while ignoring accents:
SELECT
'Café' = 'Cafe' AS "Original",
to_ascii('Café') = 'Cafe' AS "Converted";
Result:
Original | Converted
----------+-----------
f | t
The f
in the first column means false (the two strings are not equal). The t
in the second column means true (the two strings are equal).
Specify the Encoding
We also have the option of specifying the encoding as a second argument.
We can provide this as the encoding name or its number:
SELECT
to_ascii('Café', 'Latin1') AS "Name",
to_ascii('Café', 8) AS "Number";
Result:
Name | Number
------+--------
Cafe | Cafe
Supported Encodings
The to_ascii()
function supports the following encodings:
Name | Number |
---|---|
Latin1 | 8 |
Latin2 | 9 |
Latin9 | 16 |
WIN1250 | 29 |
Unsupported Encoding
Using an unsupported encoding is likely to result in an error:
SELECT to_ascii('Café');
Result:
ERROR: encoding conversion from UTF8 to ASCII not supported
In this case I ran to_ascii()
in a database that uses UTF8 encoding. The function doesn’t support UTF8 and so I got an error.