MongoDB $setUnion

In MongoDB, the $setUnion aggregation pipeline operator accepts two or more arrays and returns an array containing the elements that appear in any of those input arrays.

$setUnion accepts two or more arguments, all of which can be any valid expression as long as they each resolve to an array. $setUnion treats the arrays as sets.

Example

Suppose we have a collection called data with the following documents:

{ "_id" : 1, "a" : [ 1, 2, 3 ], "b" : [ 1, 2, 3 ] }
{ "_id" : 2, "a" : [ 1, 2, 3 ], "b" : [ 1, 2 ] }
{ "_id" : 3, "a" : [ 1, 2 ], "b" : [ 1, 2, 3 ] }
{ "_id" : 4, "a" : [ 1, 2, 3 ], "b" : [ 3, 4, 5 ] }
{ "_id" : 5, "a" : [ 1, 2, 3 ], "b" : [ 4, 5, 6 ] }

We can apply the $setUnion operator against the a and b fields in those documents.

Example:

db.data.aggregate(
   [
     { $match: { _id: { $in: [ 1, 2, 3, 4, 5 ] } } },
     {
       $project:
          {
            _id: 0,
            a: 1,
            b: 1,
            result: { $setUnion: [ "$a", "$b" ] }
          }
     }
   ]
)

Result:

{ "a" : [ 1, 2, 3 ], "b" : [ 1, 2, 3 ], "result" : [ 1, 2, 3 ] }
{ "a" : [ 1, 2, 3 ], "b" : [ 1, 2 ], "result" : [ 1, 2, 3 ] }
{ "a" : [ 1, 2 ], "b" : [ 1, 2, 3 ], "result" : [ 1, 2, 3 ] }
{ "a" : [ 1, 2, 3 ], "b" : [ 3, 4, 5 ], "result" : [ 1, 2, 3, 4, 5 ] }
{ "a" : [ 1, 2, 3 ], "b" : [ 4, 5, 6 ], "result" : [ 1, 2, 3, 4, 5, 6 ] }

Nested Arrays

The $setUnion operator does not descend into any nested arrays. It only evaluates top-level arrays.

Suppose our collection also contains the following documents:

{ "_id" : 6, "a" : [ 1, 2, 3 ], "b" : [ [ 1, 2, 3 ] ] }
{ "_id" : 7, "a" : [ 1, 2, 3 ], "b" : [ [ 1, 2 ], 3 ] }
{ "_id" : 8, "a" : [ [ 1, 2, 3 ] ], "b" : [ [ 1, 2, 3 ] ] }
{ "_id" : 9, "a" : [ [ 1, 2, 3 ] ], "b" : [ [ 1, 2 ], 3 ] }

And we apply $setUnion to those two documents:

db.data.aggregate(
   [
     { $match: { _id: { $in: [ 6, 7, 8, 9 ] } } },
     {
       $project:
          {
            _id: 0,
            a: 1,
            b: 1,
            result: { $setUnion: [ "$a", "$b" ] }
          }
     }
   ]
)

Result:

{ "a" : [ 1, 2, 3 ], "b" : [ [ 1, 2, 3 ] ], "result" : [ 1, 2, 3, [ 1, 2, 3 ] ] }
{ "a" : [ 1, 2, 3 ], "b" : [ [ 1, 2 ], 3 ], "result" : [ 1, 2, 3, [ 1, 2 ] ] }
{ "a" : [ [ 1, 2, 3 ] ], "b" : [ [ 1, 2, 3 ] ], "result" : [ [ 1, 2, 3 ] ] }
{ "a" : [ [ 1, 2, 3 ] ], "b" : [ [ 1, 2 ], 3 ], "result" : [ 3, [ 1, 2 ], [ 1, 2, 3 ] ] }

Missing Fields

Applying $setUnion to a non-existent field results in null.

Consider the following documents:

{ "_id" : 10, "a" : [ 1, 2, 3 ] }
{ "_id" : 11, "b" : [ 1, 2, 3 ] }
{ "_id" : 12 }

The first document doesn’t have a b field, the second document doesn’t have an a field, and the third document doesn’t have either.

Here’s what happens when we apply $setUnion to the a and b fields:

db.data.aggregate(
   [
     { $match: { _id: { $in: [ 10, 11, 12 ] } } },
     {
       $project:
          {
            _id: 0,
            a: 1,
            b: 1,
            result: { $setUnion: [ "$a", "$b" ] }
          }
     }
   ]
)

Result:

{ "a" : [ 1, 2, 3 ], "result" : null }
{ "b" : [ 1, 2, 3 ], "result" : null }
{ "result" : null }

Wrong Data Type

All operands of $setUnion must be arrays. If they aren’t, an error is thrown.

Suppose our collection contains the following documents:

{ "_id" : 13, "a" : [ 1, 2, 3 ], "b" : 3 }
{ "_id" : 14, "a" : 3, "b" : [ 1, 2, 3 ] }
{ "_id" : 15, "a" : 2, "b" : 3 }

And we apply $setUnion to those documents:

db.data.aggregate(
   [
     { $match: { _id: { $in: [ 13, 14, 15 ] } } },
     {
       $project:
          {
            _id: 0,
            a: 1,
            b: 1,
            result: { $setUnion: [ "$a", "$b" ] }
          }
     }
   ]
)

Result:

Error: command failed: {
	"ok" : 0,
	"errmsg" : "All operands of $setUnion must be arrays. One argument is of type: double",
	"code" : 17043,
	"codeName" : "Location17043"
} : aggregate failed :
_getErrorWithCode@src/mongo/shell/utils.js:25:13
doassert@src/mongo/shell/assert.js:18:14
_assertCommandWorked@src/mongo/shell/assert.js:639:17
assert.commandWorked@src/mongo/shell/assert.js:729:16
DB.prototype._runAggregate@src/mongo/shell/db.js:266:5
DBCollection.prototype.aggregate@src/mongo/shell/collection.js:1058:12
@(shell):1:1

Duplicate Values

The $setUnion operator filters out duplicates in its result to output an array that contain only unique entries. Also, the order of the elements in the output array is unspecified.

Suppose we have the following documents:

{ "_id" : 16, "a" : [ 1, 1, 2, 2, 3, 3 ], "b" : [ 1, 2, 3 ] }
{ "_id" : 17, "a" : [ 1, 1, 2, 2, 3, 3 ], "b" : [ 1, 1, 2 ] }
{ "_id" : 18, "a" : [ 1, 1, 2, 2, 3, 3 ], "b" : [ ] }
{ "_id" : 19, "a" : [ 3, 2, 1, 2, 3, 1 ], "b" : [ 2, 3, 1 ] }
{ "_id" : 20, "a" : [ 1, 3, 2, 2, 3, 1 ], "b" : [ 2, 1 ] }
{ "_id" : 21, "a" : [ 2, 3, 1, 2, 3, 1 ], "b" : [ ] }

Then we apply the $setUnion operator to them:

db.data.aggregate(
   [
     { $match: { _id: { $in: [ 16, 17, 18, 19, 20, 21 ] } } },
     {
       $project:
          {
            _id: 0,
            a: 1,
            b: 1,
            result: { $setUnion: [ "$a", "$b" ] }
          }
     }
   ]
)

Result:

{ "a" : [ 1, 1, 2, 2, 3, 3 ], "b" : [ 1, 2, 3 ], "result" : [ 1, 2, 3 ] }
{ "a" : [ 1, 1, 2, 2, 3, 3 ], "b" : [ 1, 1, 2 ], "result" : [ 1, 2, 3 ] }
{ "a" : [ 1, 1, 2, 2, 3, 3 ], "b" : [ ], "result" : [ 1, 2, 3 ] }
{ "a" : [ 3, 2, 1, 2, 3, 1 ], "b" : [ 2, 3, 1 ], "result" : [ 1, 2, 3 ] }
{ "a" : [ 1, 3, 2, 2, 3, 1 ], "b" : [ 2, 1 ], "result" : [ 1, 2, 3 ] }
{ "a" : [ 2, 3, 1, 2, 3, 1 ], "b" : [ ], "result" : [ 1, 2, 3 ] }

More than Two Arguments

As mentioned, $setUnion accepts two or more arguments. All previous examples used two arguments. Here’s one that uses three arguments.

Suppose we have the following documents:

{ "_id" : 22, "a" : [ 1, 2 ], "b" : [ 1, 2 ], "c" : [ 1, 2 ] }
{ "_id" : 23, "a" : [ 1, 2 ], "b" : [ 1, 2 ], "c" : [ 1, 2, 3 ] }
{ "_id" : 24, "a" : [ 1, 2 ], "b" : [ 1, 2 ], "c" : [ 4, 5 ] }

These documents have an extra field – a c field.

Now let’s apply $setUnion to those three fields:

db.data.aggregate(
   [
     { $match: { _id: { $in: [ 22, 23, 24 ] } } },
     {
       $project:
          {
            _id: 0,
            a: 1,
            b: 1,
            c: 1,
            result: { $setUnion: [ "$a", "$b", "$c" ] }
          }
     }
   ]
)

Result:

{ "a" : [ 1, 2 ], "b" : [ 1, 2 ], "c" : [ 1, 2 ], "result" : [ 1, 2 ] }
{ "a" : [ 1, 2 ], "b" : [ 1, 2 ], "c" : [ 1, 2, 3 ], "result" : [ 1, 2, 3 ] }
{ "a" : [ 1, 2 ], "b" : [ 1, 2 ], "c" : [ 4, 5 ], "result" : [ 1, 2, 4, 5 ] }