In MongoDB, the distinct aggregation command finds the distinct values for a specified field across a single collection.
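It returns a document that contains an array of the distinct values. The returned document may also contain an embedded document with query statistics and the query plan, although the examples in this article show only the values array and the ok status field.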
Distinct values are unique values, with any duplicates removed. For example, if two or three documents share the same value for the field, the distinct command returns that value only once.
There’s also a db.collection.distinct() method, which is a shell wrapper method for the distinct command.
Example
Suppose we have a collection called pets with the following documents.
{ "_id" : 1, "name" : "Wag", "type" : "Dog", "weight" : 20 }
{ "_id" : 2, "name" : "Bark", "type" : "Dog", "weight" : 10 }
{ "_id" : 3, "name" : "Meow", "type" : "Cat", "weight" : 7 }
{ "_id" : 4, "name" : "Scratch", "type" : "Cat", "weight" : 8 }
{ "_id" : 5, "name" : "Bruce", "type" : "Bat", "weight" : 3 }
{ "_id" : 6, "name" : "Fetch", "type" : "Dog", "weight" : 17 }
{ "_id" : 7, "name" : "Jake", "type" : "Dog", "weight" : 30 }
We can use the distinct command to return the distinct pet types.
The distinct command accepts the collection as the first field, and the key as the second. The key is the field for which to return distinct values.
db.runCommand ( { distinct: "pets", key: "type" } )
Result:
{ "values" : [ "Bat", "Cat", "Dog" ], "ok" : 1 }
In this example, even though there are four dogs and two cats in the collection, the array only contains one of each. The distinct command removed the duplicate values.
The collection has only one bat, so the distinct command doesn’t change that, since there were no duplicate values to remove.
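To make the deduplication concrete, here is a minimal in-memory sketch in plain JavaScript. It is not how MongoDB implements the command: the `pets` array stands in for the collection, `distinctValues` is a hypothetical helper, and the final sort is an assumption made only so the order matches the result shown above.

```javascript
// In-memory stand-in for the pets collection (same documents as above).
const pets = [
  { _id: 1, name: "Wag", type: "Dog", weight: 20 },
  { _id: 2, name: "Bark", type: "Dog", weight: 10 },
  { _id: 3, name: "Meow", type: "Cat", weight: 7 },
  { _id: 4, name: "Scratch", type: "Cat", weight: 8 },
  { _id: 5, name: "Bruce", type: "Bat", weight: 3 },
  { _id: 6, name: "Fetch", type: "Dog", weight: 17 },
  { _id: 7, name: "Jake", type: "Dog", weight: 30 },
];

// Hypothetical helper (not a MongoDB API): keep each value of `key` once.
function distinctValues(docs, key) {
  const seen = new Set();
  for (const doc of docs) {
    if (key in doc) seen.add(doc[key]); // a Set ignores repeated values
  }
  // Sorted only to mirror the order of the result shown above.
  return [...seen].sort();
}

const petTypes = distinctValues(pets, "type");
console.log(petTypes); // [ 'Bat', 'Cat', 'Dog' ]
```

Four dogs and two cats go in, but each type comes out once, which is the same behaviour the command shows for the pets collection.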
Embedded Documents
You can use dot notation to get distinct values from an embedded field.
Suppose we have a collection called products that contains the following documents:
{ "_id" : 1, "product" : { "name" : "Shirt", "color" : "White" }, "sizes" : [ "S", "M", "L" ] }
{ "_id" : 2, "product" : { "name" : "Shirt", "color" : "Green" }, "sizes" : [ "S", "M", "XL" ] }
{ "_id" : 3, "product" : { "name" : "Shirt", "color" : "White" }, "sizes" : [ "S", "M", "L" ] }
{ "_id" : 4, "product" : { "name" : "Shorts", "color" : "Green" }, "sizes" : [ "M", "XS" ] }
{ "_id" : 5, "product" : { "name" : "Shorts", "color" : "Brown" }, "sizes" : [ "S", "M" ] }
{ "_id" : 6, "product" : { "name" : "Cap", "color" : "Purple" }, "sizes" : [ "M" ] }
{ "_id" : 7, "product" : { "name" : "Shoes", "color" : "Brown" }, "sizes" : [ "S", "M", "L" ] }
{ "_id" : 8, "product" : { "name" : "Shirt", "color" : "White" }, "sizes" : [ "M", "L", "XL" ] }
{ "_id" : 9, "product" : { "name" : "Cap", "color" : "Green" }, "sizes" : [ "M", "L" ] }
We can use the following query to return distinct values for the product names.
db.runCommand ( { distinct: "products", key: "product.name" } )
Result:
{ "values" : [ "Cap", "Shirt", "Shoes", "Shorts" ], "ok" : 1 }
We can do the same thing for the color field.
db.runCommand ( { distinct: "products", key: "product.color" } )
Result:
{ "values" : [ "Brown", "Green", "Purple", "White" ], "ok" : 1 }
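The way a dotted key such as product.name reaches into an embedded document can be sketched in plain JavaScript. This is an illustration only, not MongoDB's implementation: `resolvePath` and `distinctByPath` are hypothetical helpers, the array stands in for the products collection (with the sizes field left out for brevity), and the sort is assumed just to match the order shown above.

```javascript
// In-memory stand-in for the products collection (sizes omitted for brevity).
const products = [
  { _id: 1, product: { name: "Shirt", color: "White" } },
  { _id: 2, product: { name: "Shirt", color: "Green" } },
  { _id: 3, product: { name: "Shirt", color: "White" } },
  { _id: 4, product: { name: "Shorts", color: "Green" } },
  { _id: 5, product: { name: "Shorts", color: "Brown" } },
  { _id: 6, product: { name: "Cap", color: "Purple" } },
  { _id: 7, product: { name: "Shoes", color: "Brown" } },
  { _id: 8, product: { name: "Shirt", color: "White" } },
  { _id: 9, product: { name: "Cap", color: "Green" } },
];

// Hypothetical helper: walk a dotted path one segment at a time.
function resolvePath(doc, path) {
  return path.split(".").reduce(
    (value, part) => (value == null ? undefined : value[part]),
    doc
  );
}

// Hypothetical helper: distinct values found at the end of a dotted path.
function distinctByPath(docs, path) {
  const seen = new Set();
  for (const doc of docs) {
    const value = resolvePath(doc, path);
    if (value !== undefined) seen.add(value); // skip documents without the field
  }
  return [...seen].sort(); // sorted only to mirror the output shown above
}

const names = distinctByPath(products, "product.name");
const colors = distinctByPath(products, "product.color");
console.log(names);  // [ 'Cap', 'Shirt', 'Shoes', 'Shorts' ]
console.log(colors); // [ 'Brown', 'Green', 'Purple', 'White' ]
```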
Get Distinct Values from an Array
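Here’s how to use the distinct command to get the distinct values from the sizes array field in the above collection. When the field holds an array, distinct treats each element of the array as a separate value.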
db.runCommand ( { distinct: "products", key: "sizes" } )
Result:
{ "values" : [ "L", "M", "S", "XL", "XS" ], "ok" : 1 }
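The array case can be pictured as flattening every sizes array into one list and then removing duplicates. The sketch below does this in plain JavaScript under stated assumptions: `distinctElements` is a hypothetical helper rather than a MongoDB API, the array stands in for the products collection (only the _id and sizes fields are kept), and the sort is there only to match the order shown above.

```javascript
// In-memory stand-in for the products collection (only _id and sizes kept).
const productSizes = [
  { _id: 1, sizes: ["S", "M", "L"] },
  { _id: 2, sizes: ["S", "M", "XL"] },
  { _id: 3, sizes: ["S", "M", "L"] },
  { _id: 4, sizes: ["M", "XS"] },
  { _id: 5, sizes: ["S", "M"] },
  { _id: 6, sizes: ["M"] },
  { _id: 7, sizes: ["S", "M", "L"] },
  { _id: 8, sizes: ["M", "L", "XL"] },
  { _id: 9, sizes: ["M", "L"] },
];

// Hypothetical helper: treat each array element as its own value.
function distinctElements(docs, key) {
  const seen = new Set();
  for (const doc of docs) {
    const value = doc[key];
    if (value === undefined) continue;
    // A non-array value is handled as a one-element list.
    const elements = Array.isArray(value) ? value : [value];
    for (const element of elements) seen.add(element);
  }
  return [...seen].sort(); // sorted only to mirror the output shown above
}

const sizes = distinctElements(productSizes, "sizes");
console.log(sizes); // [ 'L', 'M', 'S', 'XL', 'XS' ]
```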
Use distinct with a Query
You can provide a query to specify the documents from which to retrieve the distinct values. To do this, add the query after the key.
Example:
db.runCommand ( {
  distinct: "products",
  key: "product.name",
  query: { "sizes": "S" }
} )
Result:
{ "values" : [ "Shirt", "Shoes", "Shorts" ], "ok" : 1 }
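Conceptually, the query narrows the set of documents first, and the distinct values are then taken from whatever remains. The following plain JavaScript sketch shows that order of operations. It assumes an in-memory array in place of the products collection (product colors omitted for brevity), and `hasSmall` is a hypothetical predicate standing in for the filter on sizes, so it illustrates the idea rather than MongoDB's query engine.

```javascript
// In-memory stand-in for the products collection (colors omitted for brevity).
const catalog = [
  { _id: 1, product: { name: "Shirt" }, sizes: ["S", "M", "L"] },
  { _id: 2, product: { name: "Shirt" }, sizes: ["S", "M", "XL"] },
  { _id: 3, product: { name: "Shirt" }, sizes: ["S", "M", "L"] },
  { _id: 4, product: { name: "Shorts" }, sizes: ["M", "XS"] },
  { _id: 5, product: { name: "Shorts" }, sizes: ["S", "M"] },
  { _id: 6, product: { name: "Cap" }, sizes: ["M"] },
  { _id: 7, product: { name: "Shoes" }, sizes: ["S", "M", "L"] },
  { _id: 8, product: { name: "Shirt" }, sizes: ["M", "L", "XL"] },
  { _id: 9, product: { name: "Cap" }, sizes: ["M", "L"] },
];

// Hypothetical predicate: true when the sizes array contains "S".
const hasSmall = (doc) => doc.sizes.includes("S");

// Step 1: keep only the matching documents.
const matching = catalog.filter(hasSmall);

// Step 2: take the distinct product names from those documents.
const namesInSmall = [...new Set(matching.map((doc) => doc.product.name))].sort();

console.log(matching.map((doc) => doc._id)); // [ 1, 2, 3, 5, 7 ]
console.log(namesInSmall);                   // [ 'Shirt', 'Shoes', 'Shorts' ]
```

Cap does not appear in the result because neither Cap document lists S among its sizes.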
More Information
The distinct command also accepts other fields, such as comment, readConcern, and collation (which allows you to specify language-specific rules for string comparison, such as rules for letter case and accent marks).
See the MongoDB documentation for more information.