How EXTRACTVALUE() Works in MariaDB

In MariaDB, EXTRACTVALUE() is a built-in string function that returns CDATA from an XML fragment.

The way it works is that it accepts two string arguments: a fragment of XML markup and an XPath expression (i.e. a locator). EXTRACTVALUE() then returns the CDATA (i.e. the text) of the first text node which is a child of the element or elements matching the XPath expression.

Syntax

The syntax goes like this:

EXTRACTVALUE(xml_frag, xpath_expr)

Where xml_frag is the XML fragment, and xpath_expr is the XPath expression to match.

Example

Here’s a basic example:

SELECT EXTRACTVALUE('<type>Cat</type>', '/type') AS "Result";

Result:

+--------+
| Result |
+--------+
| Cat    |
+--------+

In this case, the XPath is /type, and so it returns the CDATA (text) from the type element.

Here’s another one:

SELECT EXTRACTVALUE('<user>Homer<iq>Low</iq></user>', '/user') AS "Result";

Result:

+--------+
| Result |
+--------+
| Homer  |
+--------+

In this case, the XPath is /user, and so EXTRACTVALUE() returns the CDATA from the user element.

Notice that it didn’t return the iq element or its text. That’s to be expected, because EXTRACTVALUE() only returns the CDATA. It doesn’t return any child elements or any text that they may contain.

Using the text() Expression

It’s the equivalent of getting a match by appending the explicit text() expression:

SELECT EXTRACTVALUE('<type>Cat</type>', '/type/text()') AS "Result";

Result:

+--------+
| Result |
+--------+
| Cat    |
+--------+

Nested Elements

Here’s an example of getting CDATA from a nested element:

SELECT EXTRACTVALUE('<user>Homer<iq>Low</iq></user>', '/user/iq') AS "Result";

Result:

+--------+
| Result |
+--------+
| Low    |
+--------+

Here, we got the CDATA from the iq element, which is nested inside the user element. We achieved this by using /user/iq as the XPath.

No Match

If no such element exists, nothing is returned.

Example:

SELECT EXTRACTVALUE('<type>Cat</type>', '/name') AS "Result";

Result:

+--------+
| Result |
+--------+
|        |
+--------+

This is also the case if you get the nesting wrong.

Example:

SELECT EXTRACTVALUE('<user>Homer<iq>Low</iq></user>', '/iq') AS "Result";

Result:

+--------+
| Result |
+--------+
|        |
+--------+

Here, we used /iq as the XPath when we should have used /user/iq.

Empty XML

An empty XML document will produce an empty result:

SELECT EXTRACTVALUE('', '/name') AS "Result";

Result:

+--------+
| Result |
+--------+
|        |
+--------+

Empty XPath

An empty XPath returns an error:

SELECT EXTRACTVALUE('<type>Cat</type>', '');

Result:

ERROR 1105 (HY000): XPATH syntax error: ''

Null XML

Providing null as the first argument results in an error:

SELECT EXTRACTVALUE(null, '');

Result:

ERROR 1105 (HY000): XPATH syntax error: ''

Null XPath

Providing null as the second argument returns null:

SELECT EXTRACTVALUE('<type>Cat</type>', null) AS "Result";

Result:

+--------+
| Result |
+--------+
| NULL   |
+--------+

Providing Just One Argument

Providing a single argument results in an error:

SELECT EXTRACTVALUE('<type>Cat</type>');

Result:

ERROR 1582 (42000): Incorrect parameter count in the call to native function 'EXTRACTVALUE'

Missing Argument

Calling EXTRACTVALUE() without passing any arguments results in an error:

SELECT EXTRACTVALUE();

Result:

ERROR 1582 (42000): Incorrect parameter count in the call to native function 'EXTRACTVALUE'

Introduction to XML

For more about XML, see my XML Tutorial over at Quackit. That tutorial also includes an Introduction to XPath.