Create a Wildcard Text Index in MongoDB

MongoDB lets us create wildcard text indexes.

Wildcard text indexes are similar to wildcard indexes, except that wildcard text indexes support the $text operator, whereas wildcard indexes don’t.

That said, both index types are created in much the same way, because both use the wildcard $** field pattern.

Example

Suppose we have a collection called posts, and it contains documents that look like this:

{
	"_id" : 1,
	"title" : "Title text...",
	"body" : "Body text...",
	"abstract" : "Abstract text...",
	"tags" : [
		"tag1",
		"tag2",
		"tag3"
	]
}

We could create a wildcard text index on that collection like this:

db.posts.createIndex( { "$**": "text" } )

Output:

{
	"createdCollectionAutomatically" : false,
	"numIndexesBefore" : 1,
	"numIndexesAfter" : 2,
	"ok" : 1
}

That uses the wildcard $** field pattern to create an index on all text fields. When you create an index like this, MongoDB indexes every field that contains string data for each document in the collection.

Doing this can be useful if the collection contains lots of unstructured content and the text fields vary from document to document. In such cases, you can't explicitly name the fields in the index, because you don't know in advance which fields the documents will contain.
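Once the index exists, it can back a $text query. The following is a minimal sketch for the MongoDB shell, assuming the posts collection shown above. The search term "abstract" is only an illustrative choice, and the typeof db guard is there purely so the snippet does nothing harmful when run outside the shell.

```javascript
// Query document for a text search; $search holds the terms to look for.
// The term "abstract" is an illustrative assumption, not taken from real data.
const textQuery = { $text: { $search: "abstract" } };

// `db` only exists inside the MongoDB shell, so guard the call.
if (typeof db !== "undefined") {
  // The wildcard text index covers every string field, so a match in
  // title, body, abstract, or tags can satisfy this query.
  db.posts.find(textQuery).forEach((doc) => printjson(doc));
}
```

Note that the query doesn't name a field. The $text operator always searches through the collection's text index, which here spans all string fields.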

Weighted Fields

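You can use the weights parameter to assign different weights to the fields in a wildcard text index. A collection can have only one text index, so if you created the index from the previous example, drop it first (for example, with db.posts.dropIndex("$**_text")) before creating the weighted one.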

Example:

db.posts.createIndex( 
  { "$**": "text" },
  { weights: {
      body: 10,
      abstract: 5
    } 
  } 
)

Output:

{
	"createdCollectionAutomatically" : false,
	"numIndexesBefore" : 1,
	"numIndexesAfter" : 2,
	"ok" : 1
}

In this case, the body field gets a weight of 10 and the abstract field gets a weight of 5. This means that the body field has twice the impact of the abstract field, and ten times the impact of all other text fields (because they will be assigned the default weight of 1).
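The ratios described above can be checked with a little arithmetic, and the effect of the weights can be observed by projecting the text score in a query. This is only a sketch: weightFor is a hypothetical helper that mirrors the default-weight rule, the search term "text" is an illustrative choice, and the real score MongoDB computes also depends on how often the term matches in each field.

```javascript
// Weights as passed to createIndex above; unlisted fields default to 1.
const weights = { body: 10, abstract: 5 };
const DEFAULT_WEIGHT = 1;

// Hypothetical helper: the weight that applies to a given field name.
function weightFor(field) {
  return Object.prototype.hasOwnProperty.call(weights, field)
    ? weights[field]
    : DEFAULT_WEIGHT;
}

console.log(weightFor("body") / weightFor("abstract")); // 2
console.log(weightFor("body") / weightFor("title"));    // 10

// `db` only exists inside the MongoDB shell, so guard the call.
if (typeof db !== "undefined") {
  // Project the relevance score and sort by it, highest first.
  db.posts
    .find(
      { $text: { $search: "text" } },
      { title: 1, score: { $meta: "textScore" } }
    )
    .sort({ score: { $meta: "textScore" } })
    .forEach((doc) => printjson(doc));
}
```

Sorting on the textScore metadata puts the most relevant documents first, which is where the higher weight on body becomes visible.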

After creating that index, if we call getIndexes() to return all indexes for the collection, we can see the weightings given to the fields:

db.posts.getIndexes()

Result:

[
	{
		"v" : 2,
		"key" : {
			"_id" : 1
		},
		"name" : "_id_"
	},
	{
		"v" : 2,
		"key" : {
			"_fts" : "text",
			"_ftsx" : 1
		},
		"name" : "$**_text",
		"weights" : {
			"$**" : 1,
			"abstract" : 5,
			"body" : 10
		},
		"default_language" : "english",
		"language_override" : "language",
		"textIndexVersion" : 3
	}
]

As expected, the body field gets 10, the abstract field gets 5, and all others get 1.
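If you'd rather read the weights programmatically than scan the full listing, you can pick the text index out of the getIndexes() result. The sketch below assumes the result has the shape shown above. findTextIndex is a hypothetical helper, and sampleIndexes is a trimmed copy of that output used as stand-in input.

```javascript
// Hypothetical helper: return the text index from a getIndexes() result.
// Text indexes report "_fts": "text" in their key, as in the output above.
function findTextIndex(indexes) {
  return indexes.find((ix) => ix.key && ix.key._fts === "text");
}

// Trimmed copy of the getIndexes() result shown above, used as sample input.
// In the MongoDB shell you would pass db.posts.getIndexes() instead.
const sampleIndexes = [
  { v: 2, key: { _id: 1 }, name: "_id_" },
  {
    v: 2,
    key: { _fts: "text", _ftsx: 1 },
    name: "$**_text",
    weights: { "$**": 1, abstract: 5, body: 10 },
  },
];

const textIndex = findTextIndex(sampleIndexes);
console.log(textIndex.name);    // $**_text
console.log(textIndex.weights); // the weights document for the index
```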