MongoDB distinct Command

In MongoDB, the distinct aggregation command finds the distinct values for a specified field across a single collection.

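It returns a document whose values field contains an array of the distinct values. Depending on the server version, the returned document may also include an embedded document with query statistics and the query plan. The examples in this article return only the values and ok fields.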

Distinct values are the unique values that remain once duplicates are removed. For example, if two or three documents have the same value in the specified field, the distinct command returns that value just once.

There’s also a db.collection.distinct() method, which is a shell wrapper method for the distinct command.

Example

Suppose we have a collection called pets with the following documents.

{ "_id" : 1, "name" : "Wag", "type" : "Dog", "weight" : 20 }
{ "_id" : 2, "name" : "Bark", "type" : "Dog", "weight" : 10 }
{ "_id" : 3, "name" : "Meow", "type" : "Cat", "weight" : 7 }
{ "_id" : 4, "name" : "Scratch", "type" : "Cat", "weight" : 8 }
{ "_id" : 5, "name" : "Bruce", "type" : "Bat", "weight" : 3 }
{ "_id" : 6, "name" : "Fetch", "type" : "Dog", "weight" : 17 }
{ "_id" : 7, "name" : "Jake", "type" : "Dog", "weight" : 30 }

We can use the distinct command to return the distinct pet types.

The distinct command accepts the collection as the first field, and the key as the second. The key is the field for which to return distinct values.

db.runCommand ( { distinct: "pets", key: "type" } )

Result:

{ "values" : [ "Bat", "Cat", "Dog" ], "ok" : 1 }

In this example, even though there are four dogs and two cats in the collection, the array only contains one of each. The distinct command removed the duplicate values.

The collection contains only one bat, so the distinct command returns that value as is. There were no duplicates to remove.
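As mentioned earlier, db.collection.distinct() wraps the same command. The following is a minimal sketch for comparing the two forms. The function name distinctPetTypes is made up for illustration, and it assumes the pets collection above exists in the current database.

```javascript
// Sketch for mongosh: pass in the shell's `db` handle.
// `distinctPetTypes` is a hypothetical helper name, not part of MongoDB.
function distinctPetTypes(db) {
  // Raw command: the distinct values are in the `values` array of the reply.
  const viaCommand = db.runCommand({ distinct: "pets", key: "type" }).values;

  // Shell helper: returns the array of distinct values directly.
  const viaHelper = db.pets.distinct("type");

  return { viaCommand, viaHelper };
}
```

In mongosh, calling distinctPetTypes(db) should give the same list of pet types in both properties.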

Embedded Documents

You can use dot notation to get distinct values from a field in an embedded document.

Suppose we have a collection called products that contains the following documents:

{ "_id" : 1, "product" : { "name" : "Shirt", "color" : "White" }, "sizes" : [ "S", "M", "L" ] }
{ "_id" : 2, "product" : { "name" : "Shirt", "color" : "Green" }, "sizes" : [ "S", "M", "XL" ] }
{ "_id" : 3, "product" : { "name" : "Shirt", "color" : "White" }, "sizes" : [ "S", "M", "L" ] }
{ "_id" : 4, "product" : { "name" : "Shorts", "color" : "Green" }, "sizes" : [ "M", "XS" ] }
{ "_id" : 5, "product" : { "name" : "Shorts", "color" : "Brown" }, "sizes" : [ "S", "M" ] }
{ "_id" : 6, "product" : { "name" : "Cap", "color" : "Purple" }, "sizes" : [ "M" ] }
{ "_id" : 7, "product" : { "name" : "Shoes", "color" : "Brown" }, "sizes" : [ "S", "M", "L" ] }
{ "_id" : 8, "product" : { "name" : "Shirt", "color" : "White" }, "sizes" : [ "M", "L", "XL" ] }
{ "_id" : 9, "product" : { "name" : "Cap", "color" : "Green" }, "sizes" : [ "M", "L" ] }

We can use the following query to return distinct values for the product names.

db.runCommand ( { distinct: "products", key: "product.name" } )

Result:

{ "values" : [ "Cap", "Shirt", "Shoes", "Shorts" ], "ok" : 1 }

We can do the same thing for the color field.

db.runCommand ( { distinct: "products", key: "product.color" } )

Result:

{ "values" : [ "Brown", "Green", "Purple", "White" ], "ok" : 1 }

Get Distinct Values from an Array

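If the key is an array field, the distinct command treats each element of the array as a separate value. Here’s how to get the distinct values from the sizes array in the products collection above.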

db.runCommand ( { distinct: "products", key: "sizes" } )

Result:

{ "values" : [ "L", "M", "S", "XL", "XS" ], "ok" : 1 }
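To make the dot notation and array behaviour concrete, here is a conceptual sketch in plain JavaScript that needs no database. It is not how the server implements distinct. The functions resolvePath and distinctValues are hypothetical names, the sample data is a subset of the products documents above, and the sketch only handles primitive values such as strings and numbers.

```javascript
// Conceptual sketch only: plain JavaScript, no MongoDB required.
// `resolvePath` and `distinctValues` are illustrative names, not MongoDB APIs.

// Resolve a dot-notation key such as "product.name" against a document.
function resolvePath(doc, key) {
  return key.split(".").reduce(
    (value, part) => (value == null ? undefined : value[part]),
    doc
  );
}

// Collect the distinct values for `key`, treating each array element
// as a separate value.
function distinctValues(docs, key) {
  const seen = new Set();
  for (const doc of docs) {
    const value = resolvePath(doc, key);
    if (value === undefined) continue; // missing field contributes nothing
    const items = Array.isArray(value) ? value : [value];
    for (const item of items) seen.add(item);
  }
  // Sorted only to make the output easy to compare by eye.
  return [...seen].sort();
}

// A subset of the products documents shown above.
const products = [
  { _id: 1, product: { name: "Shirt", color: "White" }, sizes: ["S", "M", "L"] },
  { _id: 4, product: { name: "Shorts", color: "Green" }, sizes: ["M", "XS"] },
  { _id: 8, product: { name: "Shirt", color: "White" }, sizes: ["M", "L", "XL"] },
];

console.log(distinctValues(products, "sizes"));        // [ 'L', 'M', 'S', 'XL', 'XS' ]
console.log(distinctValues(products, "product.name")); // [ 'Shirt', 'Shorts' ]
```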

Use distinct with a Query

You can provide a query to specify the documents from which to retrieve the distinct values. To do this, add a query field to the command.

Example:

db.runCommand ( {
    distinct: "products",
    key: "product.name",
    query: { "sizes": "S" }
} )

Result:

{ "values" : [ "Shirt", "Shoes", "Shorts" ], "ok" : 1 }
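The db.collection.distinct() shell method takes the query as its second argument. A minimal sketch, assuming the products collection above; the function name productNamesAvailableIn is hypothetical:

```javascript
// Sketch for mongosh: pass in the shell's `db` handle.
// `productNamesAvailableIn` is a hypothetical helper name, not part of MongoDB.
function productNamesAvailableIn(db, size) {
  // The second argument is the query document. { sizes: size } matches
  // documents whose `sizes` array contains `size`.
  return db.products.distinct("product.name", { sizes: size });
}
```

In mongosh, productNamesAvailableIn(db, "S") should correspond to the values array in the result above.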

More Information

The distinct command also accepts other fields, such as comment, readConcern, and collation (which allows you to specify language-specific rules for string comparison, such as rules for letter case and accent marks).
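As a rough illustration of how these optional fields fit into the command document, the sketch below assembles one with a query, a collation, and a comment. The function buildDistinctCommand is a hypothetical helper, and the option values (a weight filter, an English collation with strength 2, which ignores letter case) are assumptions chosen only for the example.

```javascript
// `buildDistinctCommand` is a hypothetical helper, not part of MongoDB.
function buildDistinctCommand(collection, key, options = {}) {
  // Keep `distinct` as the first key: the command name must come first
  // in the command document.
  return { distinct: collection, key, ...options };
}

const command = buildDistinctCommand("pets", "type", {
  query: { weight: { $gte: 10 } },              // only pets weighing 10 or more
  collation: { locale: "en", strength: 2 },     // compare strings ignoring case
  comment: "distinct pet types, case-insensitive",
});

console.log(JSON.stringify(command, null, 2));
```

In mongosh, the resulting document could then be passed to db.runCommand(command).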

See the MongoDB documentation for more information.