Filtering a Query Based on JSON Path Existence in DuckDB

Occasionally we might need to filter data based on the existence of a given path in a JSON document in the database. When using DuckDB, we can achieve this outcome with the JSON_EXISTS() function. This function returns true or false, depending on whether the specified path exists. Therefore, we can use it to help us filter our query on that basis.

Sample Data

Let’s create and populate a table for our examples:

-- Create the user_data table
CREATE TABLE user_data (
    id INTEGER PRIMARY KEY,
    username VARCHAR(50) NOT NULL,
    profile JSON,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- Insert sample data with varying JSON structures
INSERT INTO user_data (id, username, profile) VALUES
(1, 'flex_jackson', '{
    "name": "Flex Jackson",
    "age": 32,
    "contact": {
        "email": "[email protected]",
        "phone": "555-1234"
    },
    "preferences": {
        "theme": "dark",
        "notifications": true
    },
    "tags": ["developer", "premium"]
}'),
(2, 'kara_flatch', '{
    "name": "Kara Flatch",
    "age": 28,
    "contact": {
        "email": "[email protected]"
    },
    "preferences": {
        "theme": "light"
    },
    "tags": ["designer", "premium", "beta-tester"]
}'),
(3, 'julio_cash', '{
    "name": "Julio Cash",
    "age": 41,
    "preferences": {
        "notifications": false,
        "language": "Spanish"
    },
    "tags": []
}'),
(4, 'emma_wilson', '{
    "name": "Emma Wilson",
    "contact": {
        "phone": "555-5678",
        "address": {
            "city": "New York",
            "zip": "10001"
        }
    }
}'),
(5, 'homer_griffin', '{
    "name": "Homer Griffin",
    "inactive": true,
    "tags": ["guest"]
}');

Examples

Here are some examples of queries that filter the above data on various criteria.

Find Users with an Email Address

This query returns only those users who have provided an email address:

SELECT id, username 
FROM user_data 
WHERE JSON_EXISTS(profile, '$.contact.email');

Result:

+----+--------------+
| id | username |
+----+--------------+
| 1 | flex_jackson |
| 2 | kara_flatch |
+----+--------------+

Find Users with a Phone Number

This query returns only those users who have provided a phone number:

SELECT 
    id, 
    username,
    JSON_EXTRACT_STRING(profile, '$.contact.phone') AS phone 
FROM user_data 
WHERE JSON_EXISTS(profile, '$.contact.phone');

Result:

+----+--------------+----------+
| id | username | phone |
+----+--------------+----------+
| 1 | flex_jackson | 555-1234 |
| 4 | emma_wilson | 555-5678 |
+----+--------------+----------+

In this case I also used JSON_EXTRACT_STRING() to actually present the phone number in the results.

Find Users With Notification Preferences

This query returns only those users who have notification preferences set (regardless of whether the setting is true or false):

SELECT 
    id, 
    username,
    JSON_EXTRACT(profile, '$.preferences.notifications') AS preference
FROM user_data 
WHERE JSON_EXISTS(profile, '$.preferences.notifications');

Result:

+----+--------------+------------+
| id | username | preference |
+----+--------------+------------+
| 1 | flex_jackson | true |
| 3 | julio_cash | false |
+----+--------------+------------+

I used JSON_EXTRACT() in this case, which returns its result in JSON.

Find Users With Any Preference

This query returns users who have any preferences set:

SELECT 
    id, 
    username,
    JSON_EXTRACT(profile, '$.preferences') AS preferences
FROM user_data 
WHERE JSON_EXISTS(profile, '$.preferences');

Result:

+----+--------------+----------------------------------------------+
| id | username | preferences |
+----+--------------+----------------------------------------------+
| 1 | flex_jackson | {"theme":"dark","notifications":true} |
| 2 | kara_flatch | {"theme":"light"} |
| 3 | julio_cash | {"notifications":false,"language":"Spanish"} |
+----+--------------+----------------------------------------------+

Check Users’ Contact Methods

This is a simple query that checks whether a user has a means of contact. It first checks for email, then phone. If no contact method is found, then it returns a message accordingly:

SELECT id, username, 
       CASE 
           WHEN JSON_EXISTS(profile, '$.contact.email') THEN 'Has Email'
           WHEN JSON_EXISTS(profile, '$.contact.phone') THEN 'Has Phone'
           ELSE 'No Contact Info'
       END AS contact_status
FROM user_data;

Result:

+----+---------------+-----------------+
| id | username | contact_status |
+----+---------------+-----------------+
| 1 | flex_jackson | Has Email |
| 2 | kara_flatch | Has Email |
| 3 | julio_cash | No Contact Info |
| 4 | emma_wilson | Has Phone |
| 5 | homer_griffin | No Contact Info |
+----+---------------+-----------------+

The query is quite simple, and only for quick demonstration purposes. It returns one contact method, even if a user has multiple contact methods. But it could be modified to cater for multiple contact methods if required.

Here’s another query that returns the users’ email address for those that have one, and a message if they don’t:

SELECT 
    id, 
    username, 
    CASE 
        WHEN JSON_EXISTS(profile, '$.contact.email') THEN 
            JSON_EXTRACT_STRING(profile, '$.contact.email')
        ELSE 'No email provided'
    END AS email
FROM user_data;

Result:

+----+---------------+-------------------+
| id | username | email |
+----+---------------+-------------------+
| 1 | flex_jackson | [email protected] |
| 2 | kara_flatch | [email protected] |
| 3 | julio_cash | No email provided |
| 4 | emma_wilson | No email provided |
| 5 | homer_griffin | No email provided |
+----+---------------+-------------------+