String concatenation is a common operation when running database queries. It simply involves joining two strings together, end to end. DuckDB provides multiple methods for combining strings, each with its own use cases and advantages.
This article explores the various ways to concatenate strings in DuckDB.
Basic String Concatenation
Using the ||
Operator
The concatenation operator (||
) is used to concatenate the string to its left with the string on its right. In other words, we put ||
in between the two strings to concatenate.
Suppose we have the following table:
-- Create a sample table
CREATE TABLE employees (
id INTEGER,
first_name VARCHAR,
last_name VARCHAR,
department VARCHAR,
email VARCHAR
);
-- Insert sample data
INSERT INTO employees VALUES
(1, 'Hector', 'Ajit', 'Engineering', NULL),
(2, 'Maria', 'Swoosh', 'Marketing', '[email protected]'),
(3, 'Muhammad', NULL, 'Sales', '[email protected]'),
(4, 'Sarah', 'Jensen', NULL, '[email protected]'),
(5, NULL, 'Brattley', 'HR', '[email protected]');
A common thing to do would be to concatenate the first name and last names:
SELECT
first_name || ' ' || last_name AS full_name
FROM employees;
Result:
+--------------+
| full_name |
+--------------+
| Hector Ajit |
| Maria Swoosh |
| null |
| Sarah Jensen |
| null |
+--------------+
In this case I included a space as a string literal. Without this space, the first and last names would be pushed up against each other, without a space in between.
We can include other string literals if required:
SELECT
'Employee: ' || first_name || ' ' || last_name AS employee_info
FROM employees;
Result:
+------------------------+
| employee_info |
+------------------------+
| Employee: Hector Ajit |
| Employee: Maria Swoosh |
| null |
| Employee: Sarah Jensen |
| null |
+------------------------+
Using the CONCAT()
Function
The concat()
function is another option for concatenating strings in DuckDB:
SELECT
CONCAT(first_name, ' ', last_name) AS full_name
FROM employees;
Result:
+--------------+
| full_name |
+--------------+
| Hector Ajit |
| Maria Swoosh |
| Muhammad |
| Sarah Jensen |
| Brattley |
+--------------+
Notice that there’s a difference in how the NULL values were handled when compared with the ||
operator example above.
Here’s an example that concatenates more values:
SELECT
CONCAT('Dept: ', department, ' - ', first_name, ' ', last_name) AS employee_info
FROM employees;
Result:
+---------------------------------+
| employee_info |
+---------------------------------+
| Dept: Engineering - Hector Ajit |
| Dept: Marketing - Maria Swoosh |
| Dept: Sales - Muhammad |
| Dept: - Sarah Jensen |
| Dept: HR - Brattley |
+---------------------------------+
Handling NULL Values
As alluded to, there’s a subtle difference between how the ||
operator and the concat()
function handle NULL values. Let’s look at this in more detail.
NULL Behavior with the ||
Operator
When we used the ||
operator above, any time we tried to concatenate a NULL value with another value, the result was NULL. Here it is again:
SELECT
first_name,
last_name,
first_name || ' ' || last_name AS full_name
FROM employees;
Result:
+------------+-----------+--------------+
| first_name | last_name | full_name |
+------------+-----------+--------------+
| Hector | Ajit | Hector Ajit |
| Maria | Swoosh | Maria Swoosh |
| Muhammad | null | null |
| Sarah | Jensen | Sarah Jensen |
| null | Brattley | null |
+------------+-----------+--------------+
We can see that the third and fifth rows are NULL in the full_name column, even though these employees do actually have either a first name or last name (but not both).
Here’s one way to deal with this issue:
SELECT
COALESCE(first_name || ' ' || last_name, 'Unknown') AS handled_null
FROM employees;
Result:
+--------------+
| handled_null |
+--------------+
| Hector Ajit |
| Maria Swoosh |
| Unknown |
| Sarah Jensen |
| Unknown |
+--------------+
In this case we used the coalesce()
function to replace NULL with the string Unknown
. This function returns the first non-NULL argument, and so Unknown
is only returned whenever there’s a NULL value, otherwise the value itself is returned.
Here’s another way to deal with the issue:
SELECT
COALESCE(first_name, '') || ' ' || COALESCE(last_name, '') AS safe_full_name
FROM employees;
Result:
+----------------+
| safe_full_name |
+----------------+
| Hector Ajit |
| Maria Swoosh |
| Muhammad |
| Sarah Jensen |
| Brattley |
+----------------+
This time the result is the same as if we’d used the concat()
function. When there’s a non-NULL value, that value is returned, even if it’s being concatenated with a NULL value. Therefore in this case, the third employee’s first name becomes the full name, and the fifth employee’s last name becomes the full name.
NULL Behavior with CONCAT()
When we use the concat()
function to concatenate values, any NULLs are ignored, and non-NULL values are returned unchanged (or concatenated with another non-NULL value if there are multiple concatenations).
Here’s a comparison of using the concat()
function vs the concatenation operator:
SELECT
CONCAT(first_name, ' ', last_name) AS concat_version,
first_name || ' ' || last_name AS operator_version
FROM employees
WHERE first_name IS NULL OR last_name IS NULL;
Output:
+----------------+------------------+
| concat_version | operator_version |
+----------------+------------------+
| Brattley | null |
| Muhammad | null |
+----------------+------------------+
With concat()
, any non-NULL values are returned, while with the concatenation operator, a single NULL value results in NULL being returned, even when the other value is not NULL.
However, as we saw in the previous example, we can use the coalesce()
function with the concatenation operator in order to produce a result that’s more like what concat()
would produce.
Using the CONCAT_WS()
Function
The concat_ws()
function can be handy whenever you need to concatenate multiple values with a separator in between. The way it works is that we pass the separator as the first argument, then all subsequent arguments will be concatenated using that separator.
For example, to use a hyphen as the separator:
SELECT
CONCAT_WS(' - ', department, first_name, last_name) AS employee_info
FROM employees;
Output:
+-----------------------------+
| employee_info |
+-----------------------------+
| Engineering - Hector - Ajit |
| Marketing - Maria - Swoosh |
| Sales - Muhammad |
| Sarah - Jensen |
| HR - Brattley |
+-----------------------------+
Here’s one that creates a comma separated list:
SELECT
CONCAT_WS(',', id::VARCHAR, first_name, last_name, department) AS csv_row
FROM employees;
Output:
+---------------------------+
| csv_row |
+---------------------------+
| 1,Hector,Ajit,Engineering |
| 2,Maria,Swoosh,Marketing |
| 3,Muhammad,Sales |
| 4,Sarah,Jensen |
| 5,Brattley,HR |
+---------------------------+
Some More Examples
Building Email Addresses
Here’s an example of using concatenation to build email addresses based on the employees’ full names:
SELECT
first_name,
last_name,
LOWER(
CONCAT(
COALESCE(first_name, ''),
'.',
COALESCE(last_name, ''),
'@company.com'
)
) AS generated_email
FROM employees;
Result:
+------------+-----------+--------------------------+
| first_name | last_name | generated_email |
+------------+-----------+--------------------------+
| Hector | Ajit | [email protected] |
| Maria | Swoosh | [email protected] |
| Muhammad | null | [email protected] |
| Sarah | Jensen | [email protected] |
| null | Brattley | [email protected] |
+------------+-----------+--------------------------+
Creating Full Names with Titles
SELECT
CASE
WHEN department = 'Engineering' THEN 'Eng.'
WHEN department = 'Sales' THEN 'Sales Rep.'
ELSE 'Staff'
END || ' ' || COALESCE(first_name, '') || ' ' || COALESCE(last_name, '') AS titled_name
FROM employees;
Result:
+----------------------+
| titled_name |
+----------------------+
| Eng. Hector Ajit |
| Staff Maria Swoosh |
| Sales Rep. Muhammad |
| Staff Sarah Jensen |
| Staff Brattley |
+----------------------+
Building Dynamic SQL
Here’s an example that generates a dynamic SELECT
statement:
-- Create a dynamic WHERE clause
SELECT
'SELECT * FROM employees WHERE ' ||
STRING_AGG(
column_name || ' IS NOT NULL',
' AND '
) AS dynamic_query
FROM information_schema.columns
WHERE table_name = 'employees'
LIMIT 3;
Output:
+----------------------------------------------------------------------------------------------------------------------------------------------------+
| dynamic_query |
+----------------------------------------------------------------------------------------------------------------------------------------------------+
| SELECT * FROM employees WHERE id IS NOT NULL AND first_name IS NOT NULL AND last_name IS NOT NULL AND department IS NOT NULL AND email IS NOT NULL |
+----------------------------------------------------------------------------------------------------------------------------------------------------+
In this case we queried the information schema to return data about the columns in the employees
table.
Performance Considerations
If you’re dealing with large datasets, you’ll likely be mindful of performance. In such cases, you may find that performance changes depending on how you construct the query/concatenation operation.
For example, you might find that the following query:
SELECT
CONCAT(
COALESCE(first_name, ''),
' ',
COALESCE(last_name, ''),
' - ',
COALESCE(department, '')
) AS efficient_concat
FROM employees;
Is more efficient than the following:
SELECT
COALESCE(first_name, '') || ' ' ||
COALESCE(last_name, '') || ' - ' ||
COALESCE(department, '') AS less_efficient
FROM employees;
Tips
Use concat()
when dealing with NULL values:
-- Preferred
SELECT CONCAT(first_name, ' ', last_name) FROM employees;
-- Instead of
SELECT COALESCE(first_name, '') || ' ' || COALESCE(last_name, '') FROM employees;
Use concat_ws()
for consistent separators:
-- Preferred
SELECT CONCAT_WS(', ', first_name, last_name, department) FROM employees;
-- Instead of
SELECT first_name || ', ' || last_name || ', ' || department FROM employees;
Use string_agg()
for group concatenation:
SELECT
department,
STRING_AGG(CONCAT(first_name, ' ', last_name), ', ') as team
FROM employees
GROUP BY department;
Summary
- Use
||
for simple concatenations without NULLs - Use
concat()
when dealing with potential NULL values - Use
concat_ws()
for consistent separator-based concatenation - Use
string_agg()
for group-based concatenation - Always handle type conversions explicitly
- Consider performance implications for large datasets
Remember to test your concatenation operations with NULL values and different data types to ensure robust behavior in your queries.