There are several ways to create an index in MongoDB, and from MongoDB 4.2, we can create wildcard indexes.
A wildcard index can be thought of as a kind of filter that automatically matches any field, sub-document or array in a collection and then indexes those matches.
This can be useful if your documents contain unstructured data with different fields in different hierarchies. In such cases, there’s no way to predict what the index should be, because you don’t know what data will be in each document.
Wildcard indexes can be useful with such unstructured data, because they index all scalar values of the field, automatically recursing into any subdocuments or arrays and indexing all scalar fields in the subdocument/array.
Example Collection
Wildcard indexes are not for every collection. You would only create a wildcard index on certain collections with documents that contain unstructured data with different fields in different hierarchies.
Below is an example of a collection called pets
that might be a good candidate for a wildcard index:
{
"_id" : 1,
"name" : "Wag",
"details" : {
"type" : "Dog",
"weight" : 20,
"awards" : {
"Florida Dog Awards" : "Top Dog",
"New York Marathon" : "Fastest Dog",
"Sumo 2020" : "Biggest Dog"
}
}
}
{
"_id" : 2,
"name" : "Fetch",
"details" : {
"born" : ISODate("2020-06-22T14:00:00Z"),
"color" : "Black"
}
}
{
"_id" : 3,
"name" : "Scratch",
"details" : {
"eats" : [
"Mouse Porridge",
"Bird Soup",
"Caviar"
],
"type" : "Cat",
"born" : ISODate("2020-12-19T14:00:00Z")
}
}
Each of the 3 documents in this collection has a details
field, but they contain different fields within that field. It’s not consistent. This would normally make it hard to create an index, because we don’t know what fields are going to be in each document. We would probably need to create multiple indexes, after careful analysis of the possible document structures.
Fortunately we can create a wildcard index.
But first, let’s take a look at what a query plan might look like when querying one of those fields. Imagine we want to find out which dog got the “Fastest Dog” award at the New York Marathon. We could do the following:
db.pets.find( { "details.awards.New York Marathon" : "Fastest Dog" } )
And if we wanted to check the query plan, we could append explain()
to the end:
db.pets.find( { "details.awards.New York Marathon" : "Fastest Dog" } ).explain()
Which returns the following:
{
"queryPlanner" : {
"plannerVersion" : 1,
"namespace" : "PetHotel.pets",
"indexFilterSet" : false,
"parsedQuery" : {
"details.awards.New York Marathon" : {
"$eq" : "Fastest Dog"
}
},
"queryHash" : "EC0D5185",
"planCacheKey" : "EC0D5185",
"winningPlan" : {
"stage" : "COLLSCAN",
"filter" : {
"details.awards.New York Marathon" : {
"$eq" : "Fastest Dog"
}
},
"direction" : "forward"
},
"rejectedPlans" : [ ]
},
"ok" : 1
}
Which tells us that it was going to do a collection scan (COLLSCAN), which means that it has to scan through every document looking for the field.
Create a Wildcard Index
Here’s an example of creating a wildcard index for the above collection.
db.pets.createIndex({ "details.$**": 1 });
Output:
{ "createdCollectionAutomatically" : false, "numIndexesBefore" : 1, "numIndexesAfter" : 2, "ok" : 1 }
That’s it. The wildcard index has been created.
To create the wildcard index we used the field name that we wanted to create the index on (in this case the details
field), then we appended that with a dot (.
), and then the important part, the $**
part.
The $**
specifies that a wildcard index should be created from this field and all its subdocuments.
Prefixing the $**
with details
limits the wildcard index’s scope to just the details
field.
Now let’s re-check the query plan for the aforementioned query:
db.pets.find( { "details.awards.New York Marathon" : "Fastest Dog" } ).explain()
Result:
{
"queryPlanner" : {
"plannerVersion" : 1,
"namespace" : "PetHotel.pets",
"indexFilterSet" : false,
"parsedQuery" : {
"details.awards.New York Marathon" : {
"$eq" : "Fastest Dog"
}
},
"queryHash" : "EC0D5185",
"planCacheKey" : "7DFA23ED",
"winningPlan" : {
"stage" : "FETCH",
"inputStage" : {
"stage" : "IXSCAN",
"keyPattern" : {
"$_path" : 1,
"details.awards.New York Marathon" : 1
},
"indexName" : "details.$**_1",
"isMultiKey" : false,
"multiKeyPaths" : {
"$_path" : [ ],
"details.awards.New York Marathon" : [ ]
},
"isUnique" : false,
"isSparse" : false,
"isPartial" : false,
"indexVersion" : 2,
"direction" : "forward",
"indexBounds" : {
"$_path" : [
"[\"details.awards.New York Marathon\", \"details.awards.New York Marathon\"]"
],
"details.awards.New York Marathon" : [
"[\"Fastest Dog\", \"Fastest Dog\"]"
]
}
}
},
"rejectedPlans" : [ ]
},
"ok" : 1
}
This time the collection scan (COLLSCAN) has been replaced by an index scan (IXSCAN) on our newly created wildcard index.
Each field within our details
field has been indexed as a path/value, and there’s an entry in the index for every field in the hierarchy. Where the field value is a subdocument (such as our. awards
field), the indexing has descended into the subdocument and repeated the process.
Creating a Wildcard Index on All Field Paths
In the previous example, we created a wildcard index on a single field path. It’s possible to create a wildcard index on all fields paths simply by using the $**
without prefixing it with a field.
For example, we could have done this:
db.pets.createIndex({ "$**": 1 });
That would have created a wildcard index on all field paths.
Actually, that’s not quite true. By default, wildcard indexes aren’t created on the _id
field. To include the _id
field, you would need to include it in a wildcardProjection
document.
Can’t Create Wildcard Indexes? Check this Setting.
The mongod
featureCompatibilityVersion
must be at least 4.2
to create wildcard indexes.
You can check this setting with the following code:
db.adminCommand(
{
getParameter: 1,
featureCompatibilityVersion: 1
}
)
You can set it using the setFeatureCompatibilityVersion
command:
db.adminCommand( { setFeatureCompatibilityVersion: "4.4" } )
The setFeatureCompatibilityVersion
command needs to be run in the admin
database.