DuckDB’s json_type()
function is a useful utility for inspecting JSON data structures. It helps us determine what type of data we’re working with at any level within a JSON document. This function can be handy when we need to validate data types or handle different JSON structures programmatically.
Syntax
The basic syntax goes like this:
json_type(json[, path])
json
: The JSON data we want to inspectpath
: (Optional) A path to a specific element within the JSON structure
Return Types
The function returns one of eight possible type identifiers:
ARRAY
: JSON arrays like[1, 2, 3]
BIGINT
: Large integersBOOLEAN
: True/false valuesDOUBLE
: Floating-point numbersOBJECT
: JSON objects with key-value pairsUBIGINT
: Unsigned large integersVARCHAR
: String valuesNULL
: Null values
If path
 is a LIST
, the result will be LIST
 of types.
Basic Usage Examples
Let’s start with simple examples to understand how json_type()
identifies different data types:
SELECT
json_type('"Water"') AS r1,
json_type('42.5') AS r2,
json_type('100') AS r3,
json_type('true') AS r4,
json_type('null') AS r5,
json_type('[1, 2, 3, "hello"]') AS r6,
json_type('{"name": "John", "age": 30}') AS r7;
Result:
+---------+--------+---------+---------+------+-------+--------+
| r1 | r2 | r3 | r4 | r5 | r6 | r7 |
+---------+--------+---------+---------+------+-------+--------+
| VARCHAR | DOUBLE | UBIGINT | BOOLEAN | NULL | ARRAY | OBJECT |
+---------+--------+---------+---------+------+-------+--------+
Removing the Single Quotes
The json_type()
function accepts VARCHAR
and JSON
arguments. It’s possible to remove the single quotes from some of the arguments, but not all. For example, this works:
SELECT
json_type(true),
json_type(null);
Output:
+---------------------------------+-----------------+
| json_type(CAST('t' AS BOOLEAN)) | json_type(NULL) |
+---------------------------------+-----------------+
| BOOLEAN | NULL |
+---------------------------------+-----------------+
But this doesn’t:
SELECT json_type(100);
Output:
Binder Error:
No function matches the given name and argument types 'json_type(INTEGER_LITERAL)'. You might need to add explicit type casts.
Candidate functions:
json_type(VARCHAR) -> VARCHAR
json_type(VARCHAR, VARCHAR) -> VARCHAR
json_type(VARCHAR, VARCHAR[]) -> VARCHAR[]
json_type(JSON) -> VARCHAR
json_type(JSON, VARCHAR) -> VARCHAR
json_type(JSON, VARCHAR[]) -> VARCHAR[]
LINE 1: SELECT json_type(100);
^
Also, JSON strings are surrounded in double quotes, so the following fails:
SELECT json_type('Water');
Result:
Invalid Input Error:
Malformed JSON at byte 0 of input: unexpected character. Input: Water
Using Paths to Inspect Nested Elements
The json_type()
function works perfectly when working with complex nested JSON structures. We can specify a path to examine specific elements within the JSON.
We can use json_type()
to get the type of all values at the top level:
WITH sample_data AS (
SELECT '{
"name": "Frida",
"age": 25,
"scores": [95, 87, 92],
"preferences": {
"theme": "dark",
"notifications": true
},
"active": true
}' as json_col
)
SELECT
json_type(json_col, '$.name') as name_type,
json_type(json_col, '$.age') as age_type,
json_type(json_col, '$.scores') as scores_type,
json_type(json_col, '$.preferences') as preferences_type,
json_type(json_col, '$.active') as active_type
FROM sample_data;
Result:
+-----------+----------+-------------+------------------+-------------+
| name_type | age_type | scores_type | preferences_type | active_type |
+-----------+----------+-------------+------------------+-------------+
| VARCHAR | UBIGINT | ARRAY | OBJECT | BOOLEAN |
+-----------+----------+-------------+------------------+-------------+
We can also dig deeper into nested objects:
WITH sample_data AS (
SELECT '{
"name": "Frida",
"age": 25,
"scores": [95, 87, 92],
"preferences": {
"theme": "dark",
"notifications": true
},
"active": true
}' as json_col
)
SELECT
json_type(json_col, '$.scores') as scores_type,
json_type(json_col, '$.scores[0]') as score_0_type,
json_type(json_col, '$.scores[1]') as score_1_type,
json_type(json_col, '$.scores[2]') as score_2_type,
json_type(json_col, '$.preferences') as preferences_type,
json_type(json_col, '$.preferences.theme') as theme_type,
json_type(json_col, '$.preferences.notifications') as theme_type
FROM sample_data;
Result:
+-------------+--------------+--------------+--------------+------------------+------------+------------+
| scores_type | score_0_type | score_1_type | score_2_type | preferences_type | theme_type | theme_type |
+-------------+--------------+--------------+--------------+------------------+------------+------------+
| ARRAY | UBIGINT | UBIGINT | UBIGINT | OBJECT | VARCHAR | BOOLEAN |
+-------------+--------------+--------------+--------------+------------------+------------+------------+
Here’s an array that consists of various types:
WITH array_data AS (
SELECT '[100, "text", true, null, {"key": "value"}]' as json_col
)
SELECT
json_type(json_col, '$[0]') as first_element,
json_type(json_col, '$[1]') as second_element,
json_type(json_col, '$[2]') as third_element,
json_type(json_col, '$[3]') as fourth_element,
json_type(json_col, '$[4]') as fifth_element
FROM array_data;
Result:
+---------------+----------------+---------------+----------------+---------------+
| first_element | second_element | third_element | fourth_element | fifth_element |
+---------------+----------------+---------------+----------------+---------------+
| UBIGINT | VARCHAR | BOOLEAN | NULL | OBJECT |
+---------------+----------------+---------------+----------------+---------------+
Here’s a JSON example with a deeper structure:
WITH nested_data AS (
SELECT '{
"user": {
"profile": {
"personal": {
"name": "Bob",
"contacts": ["[email protected]", "555-1234"]
}
}
}
}' as json_col
)
SELECT
json_type(json_col, '$.user') as user_type,
json_type(json_col, '$.user.profile') as profile_type,
json_type(json_col, '$.user.profile.personal.name') as name_type,
json_type(json_col, '$.user.profile.personal.contacts') as contacts_type,
json_type(json_col, '$.user.profile.personal.contacts[0]') as first_contact_type
FROM nested_data;
Output:
+-----------+--------------+-----------+---------------+--------------------+
| user_type | profile_type | name_type | contacts_type | first_contact_type |
+-----------+--------------+-----------+---------------+--------------------+
| OBJECT | OBJECT | VARCHAR | ARRAY | VARCHAR |
+-----------+--------------+-----------+---------------+--------------------+
Working with Lists of Paths
When we provide a list of paths, json_type()
returns a list of corresponding types.
So if we wanted to put all of the previous example’s JSON types into a list, we could rewrite the query to this:
WITH nested_data AS (
SELECT '{
"user": {
"profile": {
"personal": {
"name": "Bob",
"contacts": ["[email protected]", "555-1234"]
}
}
}
}' as json_col
)
SELECT
json_type(json_col, [
'$.user',
'$.user.profile',
'$.user.profile.personal.name',
'$.user.profile.personal.contacts',
'$.user.profile.personal.contacts[0]'
]) as all_types
FROM nested_data;
Result:
+-------------------------------------------+
| all_types |
+-------------------------------------------+
| [OBJECT, OBJECT, VARCHAR, ARRAY, VARCHAR] |
+-------------------------------------------+
Handling Non-Existent Paths
When we specify a path that doesn’t exist in the JSON structure, json_type()
returns NULL
:
SELECT
json_type('{"a": 1}', '$.oops') AS r1,
json_type('[1, 2, 3]', '$[10]') AS r2;
Output:
+------+------+
| r1 | r2 |
+------+------+
| NULL | NULL |
+------+------+