How to Create a Text Index in MongoDB

There are various types of indexes that you can create in MongoDB. If you have a field that contains a string or an array of strings, you can create a text index on that field to support text search queries.

To create a text index, call createIndex() and use the string literal "text" as the value of each field you want to index.

Create a Text Index on a Single Field

Suppose we have a collection called posts, and it contains documents like this:

{
	"_id" : 1,
	"title" : "The Web",
	"body" : "Body text...",
	"abstract" : "Abstract text..."
}

We might want to create a text index on the body field, or the abstract field, or even both.

Here’s how to create a text index on the body field:

db.posts.createIndex( { body : "text" } )

Output:

{
	"createdCollectionAutomatically" : false,
	"numIndexesBefore" : 1,
	"numIndexesAfter" : 2,
	"ok" : 1
}

We can now use the getIndexes() method to view the index:

db.posts.getIndexes()

Result:

[
	{
		"v" : 2,
		"key" : {
			"_id" : 1
		},
		"name" : "_id_"
	},
	{
		"v" : 2,
		"key" : {
			"_fts" : "text",
			"_ftsx" : 1
		},
		"name" : "body_text",
		"weights" : {
			"body" : 1
		},
		"default_language" : "english",
		"language_override" : "language",
		"textIndexVersion" : 3
	}
]

We can see that there are two indexes. The first one is the default _id index that is created automatically with the collection. The second index is the one we just created.

MongoDB has automatically assigned a name to our newly created index. It’s called body_text.
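
The default name follows a simple pattern: each indexed field name is joined to its index type (or sort direction) with an underscore. As a rough illustration, the hypothetical helper below (defaultIndexName is not part of MongoDB) reproduces that pattern in plain JavaScript:

```javascript
// Hypothetical helper, not a MongoDB API: mimics how the default
// index name is built from an index specification document.
function defaultIndexName(keys) {
  return Object.entries(keys)
    .map(([field, type]) => `${field}_${type}`)
    .join("_");
}

console.log(defaultIndexName({ body: "text" })); // body_text

// To choose a name yourself, createIndex() accepts a name option, e.g. in mongosh:
//   db.posts.createIndex({ body: "text" }, { name: "my_text_index" })
```

The name matters later on, because dropIndex() can take the index name as its argument.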

Create a Compound Text Index

A collection can only have one text index, but it can be a compound index if required.

Let’s create a compound index that includes the body field and the abstract field.

As mentioned, a collection can only have one text index, so let’s drop the index we just created:

db.posts.dropIndex("body_text")

Output:

{ "nIndexesWas" : 2, "ok" : 1 }

OK, now that we’ve dropped the text index, let’s go ahead and create another one – this time it will be a compound index:

db.posts.createIndex( { 
  body : "text",
  abstract : "text"
} )

Output:

{
	"createdCollectionAutomatically" : false,
	"numIndexesBefore" : 1,
	"numIndexesAfter" : 2,
	"ok" : 1
}

That’s a confirmation message that tells us that there used to be 1 index but now there are 2.

Let’s check the list of indexes again:

db.posts.getIndexes()

Result:

[
	{
		"v" : 2,
		"key" : {
			"_id" : 1
		},
		"name" : "_id_"
	},
	{
		"v" : 2,
		"key" : {
			"_fts" : "text",
			"_ftsx" : 1
		},
		"name" : "body_text_abstract_text",
		"weights" : {
			"abstract" : 1,
			"body" : 1
		},
		"default_language" : "english",
		"language_override" : "language",
		"textIndexVersion" : 3
	}
]

Note that compound text indexes have the following restrictions:

  • A compound text index cannot include any other special index types, such as multi-key or geospatial index fields.
  • If the compound text index includes keys preceding the text index key, to perform a $text search, the query predicate must include equality match conditions on the preceding keys.
  • When creating a compound text index, all text index keys must be listed adjacently in the index specification document.
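
To illustrate the second restriction, here is a minimal sketch in plain JavaScript. The category field is hypothetical (it isn’t in the sample documents), and the mongosh calls are shown only as comments, since they need a running session and would require dropping any existing text index first:

```javascript
// Index specification: an ascending key (the hypothetical "category" field)
// precedes the text index key.
const keys = { category: 1, body: "text" };

// Because category precedes the text key, a $text query must also
// include an equality match on category.
const filter = {
  category: "tech",
  $text: { $search: "web" }
};

// In mongosh, these would be used along these lines:
//   db.posts.createIndex(keys)
//   db.posts.find(filter)
console.log(JSON.stringify(filter));
```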

Create a Wildcard Text Index

You can create a wildcard text index by using the wildcard $** field pattern. This indexes every field that contains string data in each document.

Let’s drop the previous index and create a wildcard text index:

db.posts.dropIndex("body_text_abstract_text")
db.posts.createIndex( { "$**" : "text" } )

MongoDB also lets us create wildcard indexes, but wildcard text indexes and wildcard indexes are two distinct things.

In particular, wildcard text indexes support the $text operator, whereas wildcard indexes don’t.
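
As a rough sketch, a $text query is expressed as a filter document like the one below. The search term is arbitrary, and the db.posts.find() call is shown as a comment because it needs a running mongosh session:

```javascript
// Filter document for a text search. $text relies on the collection's text index.
const filter = { $text: { $search: "web" } };

// In mongosh:
//   db.posts.find(filter)
console.log(JSON.stringify(filter)); // {"$text":{"$search":"web"}}
```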

The weights Parameter

When creating text indexes, you have the option of specifying a weight on one or more fields. By default, each field is given a weight of 1. But you can change this in order to give fields more or less weighting in the search results.

Example

db.posts.dropIndex("$**_text")
db.posts.createIndex( 
  { 
    title : "text",
    body : "text",
    abstract : "text"
  },
  {
    weights: {
      body: 10,
      abstract: 5
    } 
  } 
)

We started by dropping the previous index.

The new text index covers three fields, but we specified weights for only two of them.

The result is that those two fields will be weighted as specified, and the other field (title) will have the default weight of 1.

We can see this when we run getIndexes() again:

db.posts.getIndexes()

Result:

[
	{
		"v" : 2,
		"key" : {
			"_id" : 1
		},
		"name" : "_id_"
	},
	{
		"v" : 2,
		"key" : {
			"_fts" : "text",
			"_ftsx" : 1
		},
		"name" : "title_text_body_text_abstract_text",
		"weights" : {
			"abstract" : 5,
			"body" : 10,
			"title" : 1
		},
		"default_language" : "english",
		"language_override" : "language",
		"textIndexVersion" : 3
	}
]

This means that the body field will have twice the significance of the abstract field, and ten times the significance of the title field.
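
The arithmetic behind that can be sketched in plain JavaScript. The effectiveWeights helper is hypothetical (it is not a MongoDB function); it only mirrors the rule that unlisted fields fall back to a weight of 1:

```javascript
// Hypothetical helper, not a MongoDB API: fills in the default weight of 1
// for every indexed field that has no explicit weight.
function effectiveWeights(fields, weights = {}) {
  const result = {};
  for (const field of fields) {
    result[field] = weights[field] ?? 1;
  }
  return result;
}

const w = effectiveWeights(["title", "body", "abstract"], { body: 10, abstract: 5 });
console.log(w);                   // { title: 1, body: 10, abstract: 5 }
console.log(w.body / w.abstract); // 2
console.log(w.body / w.title);    // 10
```

To see how the weights influence actual results in mongosh, one option is to project and sort by the text score, for example db.posts.find({ $text: { $search: "web" } }, { score: { $meta: "textScore" } }).sort({ score: { $meta: "textScore" } }).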

Creating Multiple Language Text Indexes

You’ll notice that the above text index includes "default_language" : "english" and "language_override" : "language" in its definition.

These fields assist in dealing with documents in multiple languages. The values in the above index are the default values.

When you create a document, you can specify its language in the language field (or in whichever field is named by the text index’s language_override option). If the document has no such field, MongoDB uses the language specified in the index’s default_language option.

You can specify a default_language (and language_override) when you create the index.
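
As a sketch of how those two options fit together, the snippet below builds an options document and uses a hypothetical helper (languageFor is not a MongoDB function) to show which language would apply to a given document. The spanish default and the idioma field name are illustrative choices only, and the helper looks at top-level fields only:

```javascript
// Index options: "spanish" as the default language, and "idioma" as the
// document field that can override it (both are illustrative choices).
const options = { default_language: "spanish", language_override: "idioma" };

// Hypothetical helper: the override field wins when present,
// otherwise the index's default language applies.
function languageFor(doc, opts) {
  return doc[opts.language_override] ?? opts.default_language;
}

console.log(languageFor({ _id: 1, body: "Hola" }, options));                     // spanish
console.log(languageFor({ _id: 2, body: "Hello", idioma: "english" }, options)); // english

// In mongosh (after dropping any existing text index):
//   db.posts.createIndex({ body: "text" }, options)
```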

See Create a Multi-Language Text Index in MongoDB for examples of creating text indexes that support multiple languages.