In MariaDB, EXTRACTVALUE()
is a built-in string function that returns CDATA from an XML fragment.
The way it works is that it accepts two string arguments: a fragment of XML markup and an XPath expression (i.e. a locator). EXTRACTVALUE()
then returns the CDATA (i.e. the text) of the first text node which is a child of the element or elements matching the XPath expression.
Syntax
The syntax goes like this:
EXTRACTVALUE(xml_frag, xpath_expr)
Where xml_frag
is the XML fragment, and xpath_expr
is the XPath expression to match.
Example
Here’s a basic example:
SELECT EXTRACTVALUE('<type>Cat</type>', '/type') AS "Result";
Result:
+--------+ | Result | +--------+ | Cat | +--------+
In this case, the XPath is /type
, and so it returns the CDATA (text) from the type
element.
Here’s another one:
SELECT EXTRACTVALUE('<user>Homer<iq>Low</iq></user>', '/user') AS "Result";
Result:
+--------+ | Result | +--------+ | Homer | +--------+
In this case, the XPath is /user
, and so EXTRACTVALUE()
returns the CDATA from the user
element.
Notice that it didn’t return the iq
element or its text. That’s to be expected, because EXTRACTVALUE()
only returns the CDATA. It doesn’t return any child elements or any text that they may contain.
Using the text() Expression
It’s the equivalent of getting a match by appending the explicit text()
expression:
SELECT EXTRACTVALUE('<type>Cat</type>', '/type/text()') AS "Result";
Result:
+--------+ | Result | +--------+ | Cat | +--------+
Nested Elements
Here’s an example of getting CDATA from a nested element:
SELECT EXTRACTVALUE('<user>Homer<iq>Low</iq></user>', '/user/iq') AS "Result";
Result:
+--------+ | Result | +--------+ | Low | +--------+
Here, we got the CDATA from the iq
element, which is nested inside the user
element. We achieved this by using /user/iq
as the XPath.
No Match
If no such element exists, nothing is returned.
Example:
SELECT EXTRACTVALUE('<type>Cat</type>', '/name') AS "Result";
Result:
+--------+ | Result | +--------+ | | +--------+
This is also the case if you get the nesting wrong.
Example:
SELECT EXTRACTVALUE('<user>Homer<iq>Low</iq></user>', '/iq') AS "Result";
Result:
+--------+ | Result | +--------+ | | +--------+
Here, we used /iq
as the XPath when we should have used /user/iq
.
Empty XML
An empty XML document will produce an empty result:
SELECT EXTRACTVALUE('', '/name') AS "Result";
Result:
+--------+ | Result | +--------+ | | +--------+
Empty XPath
An empty XPath returns an error:
SELECT EXTRACTVALUE('<type>Cat</type>', '');
Result:
ERROR 1105 (HY000): XPATH syntax error: ''
Null XML
Providing null
as the first argument results in an error:
SELECT EXTRACTVALUE(null, '');
Result:
ERROR 1105 (HY000): XPATH syntax error: ''
Null XPath
Providing null
as the second argument returns null
:
SELECT EXTRACTVALUE('<type>Cat</type>', null) AS "Result";
Result:
+--------+ | Result | +--------+ | NULL | +--------+
Providing Just One Argument
Providing a single argument results in an error:
SELECT EXTRACTVALUE('<type>Cat</type>');
Result:
ERROR 1582 (42000): Incorrect parameter count in the call to native function 'EXTRACTVALUE'
Missing Argument
Calling EXTRACTVALUE()
without passing any arguments results in an error:
SELECT EXTRACTVALUE();
Result:
ERROR 1582 (42000): Incorrect parameter count in the call to native function 'EXTRACTVALUE'
Introduction to XML
For more about XML, see my XML Tutorial over at Quackit. That tutorial also includes an Introduction to XPath.