How WEIGHT_STRING() Works in MariaDB

In MariaDB, WEIGHT_STRING() is a built-in function that returns the weight string for the input string. The return value is a binary string that represents the comparison and sorting value of the string.

If the input string is a nonbinary value, the return value contains the string’s collation weight. If it’s a binary value, the result is the same as the input string. This is because the weight for each byte in a binary string is the byte value.

This function is a debugging function intended for internal use. It can be used for testing and debugging of collations.

Syntax

The syntax goes like this:

WEIGHT_STRING(str [AS {CHAR|BINARY}(N)] [LEVEL levels] [flags])
  levels: N [ASC|DESC|REVERSE] [, N [ASC|DESC|REVERSE]] ... 

Below is a description of each argument/clause.

The AS Clause

The optional AS clause permits casting the input string to a binary or non-binary string, as well as to a particular length.

  • AS BINARY(N) measures the length in bytes rather than characters, and right pads with 0x00 bytes to the desired length.
  • AS CHAR(N) measures the length in characters, and right pads with spaces to the desired length.

N has a minimum value of 1, and if it is less than the length of the input string, the string is truncated without warning.

The LEVEL Clause

Specifies that the return value should contain weights for specific collation levels.

The levels specifier can either be a single integer, a comma-separated list of integers, or a range of integers separated by a dash (whitespace is ignored). Integers can range from 1 to a maximum of 6, dependent on the collation, and need to be listed in ascending order.

  • If the LEVEL clause is not provided, a default of 1 to the maximum for the collation is assumed.
  • If the LEVEL is specified without using a range, an optional modifier is permitted.
  • ASC (the default) returns the weights without any modification.
  • DESC returns bitwise-inverted weights.
  • REVERSE returns the weights in reverse order.

Example

Here’s a basic example:

SELECT HEX(WEIGHT_STRING('z'));

Result:

+-------------------------+
| HEX(WEIGHT_STRING('z')) |
+-------------------------+
| 005A                    |
+-------------------------+

Here, we use the HEX() function to represent non-printable results in hexadecimal format.

The AS Clause

Here’s an example using the AS clause to cast the input string to a given type and length.

SET @str = 'z';
SELECT 
  HEX(WEIGHT_STRING(@str AS CHAR(3))) 'Char 3',
  HEX(WEIGHT_STRING(@str AS CHAR(8))) 'Char 8',
  HEX(WEIGHT_STRING(@str AS BINARY(3))) 'Binary 3',
  HEX(WEIGHT_STRING(@str AS BINARY(8))) 'Binary 8';

Result (using vertical output):

  Char 3: 005A00200020
  Char 8: 005A0020002000200020002000200020
Binary 3: 7A0000
Binary 8: 7A00000000000000

Collation

The following two examples demonstrate how a string can have a different weight string, depending on the collation being used.

The collation used in the first example is case-insensitive. The collation used in the second example is case-sensitive.

SET @upper = _latin1 'PLAY' COLLATE latin1_general_ci;
SET @lower = lcase(@upper);
SELECT 
  @upper 'String', 
  HEX(@upper) 'Hex', 
  HEX(WEIGHT_STRING(@upper)) 'Weight String'
UNION ALL
SELECT 
  @lower, 
  HEX(@lower), 
  HEX(WEIGHT_STRING(@lower));

Result:

+--------+----------+---------------+
| String | Hex      | Weight String |
+--------+----------+---------------+
| PLAY   | 504C4159 | 8F7941AA      |
| play   | 706C6179 | 8F7941AA      |
+--------+----------+---------------+

And here’s the same example, except with a case-sensitive collation.

SET @upper = _latin1 'PLAY' COLLATE latin1_general_cs;
SET @lower = lcase(@upper);
SELECT 
  @upper 'String', 
  HEX(@upper) 'Hex', 
  HEX(WEIGHT_STRING(@upper)) 'Weight String'
UNION ALL
SELECT 
  @lower, 
  HEX(@lower), 
  HEX(WEIGHT_STRING(@lower));

Result:

+--------+----------+---------------+
| String | Hex      | Weight String |
+--------+----------+---------------+
| PLAY   | 504C4159 | 8F7941AA      |
| play   | 706C6179 | 907A42AB      |
+--------+----------+---------------+

Null Arguments

Passing null returns null:

SELECT WEIGHT_STRING(null);

Result:

+---------------------+
| WEIGHT_STRING(null) |
+---------------------+
| NULL                |
+---------------------+

Missing Arguments

Calling WEIGHT_STRING() with the wrong number of arguments, or without passing any arguments results in an error:

SELECT WEIGHT_STRING();

Result:

ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MariaDB server version for the right syntax to use near ')' at line 1