PostgreSQL has a unistr()
function that evaluates escaped Unicode characters in its argument and returns the result as text.
Example
Here’s an example to demonstrate:
SELECT unistr('\0063\0061\0066\00E9');
Result:
café
Unicode characters can be specified as \XXXX
(4 hexadecimal digits), \+XXXXXX
(6 hexadecimal digits), \uXXXX
(4 hexadecimal digits), or \UXXXXXXXX
(8 hexadecimal digits).
Therefore, we can do any of the following to get the same result as the above example:
SELECT
unistr('\0063\0061\0066\00E9') AS "\XXXX",
unistr('\+000063\+000061\+000066\+0000E9') AS "+XXXXXX",
unistr('\u0063\u0061\u0066\u00E9') AS "\uXXXX",
unistr('\U00000063\U00000061\U00000066\U000000E9') AS "\UXXXXXXXX";
Result:
\XXXX | +XXXXXX | \uXXXX | \UXXXXXXXX
-------+---------+--------+------------
café | café | café | café
If we want to include a backslash, we can escape it with another backslash. In other words, to include a backslash use two backslashes.
If the server encoding is not UTF-8, the Unicode code point identified by one of these escape sequences is converted to the actual server encoding. If it’s not possible to do that, then an error occurs.
Null Argument
If the argument is null
then null
is returned:
SELECT unistr(null);
Result:
null
Wrong Argument Type
Passing an argument of the wrong type results in an error:
SELECT unistr(123);
Result:
ERROR: function unistr(integer) does not exist
LINE 1: SELECT unistr(123);
^
HINT: No function matches the given name and argument types. You might need to add explicit type casts.