MongoDB supports several types of indexes. If a field contains a string or an array of strings, you can create a text index on that field.
To create a text index, use the string literal "text" as the field's value in the index specification document.
Create a Text Index on a Single Field
Suppose we have a collection called posts, and it contains documents like this:
{ "_id" : 1, "title" : "The Web", "body" : "Body text...", "abstract" : "Abstract text..." }
We might want to create a text index on the body field, or the abstract field, or even both.
Here’s how to create a text index on the body field:
db.posts.createIndex( { body : "text" } )
Output:
{ "createdCollectionAutomatically" : false, "numIndexesBefore" : 1, "numIndexesAfter" : 2, "ok" : 1 }
We can now use the getIndexes() method to view the index:
db.posts.getIndexes()
Result:
[ { "v" : 2, "key" : { "_id" : 1 }, "name" : "_id_" }, { "v" : 2, "key" : { "_fts" : "text", "_ftsx" : 1 }, "name" : "body_text", "weights" : { "body" : 1 }, "default_language" : "english", "language_override" : "language", "textIndexVersion" : 3 } ]
We can see that there are two indexes. The first one is the default _id index that is created automatically with the collection. The second index is the one we just created.
MongoDB has automatically assigned a name to our newly created index. It’s called body_text.
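The generated name follows a simple pattern: each field name is joined to its index type with an underscore. The sketch below reproduces that pattern in plain JavaScript, which can be handy for predicting the name you'll later pass to dropIndex(). Note that defaultIndexName is a hypothetical helper written for this illustration, not part of MongoDB's API.

```javascript
// Hypothetical helper (not a MongoDB API): rebuilds the default index name
// from an index specification document by joining each field name and its
// value with underscores.
function defaultIndexName(keys) {
  return Object.entries(keys)
    .map(([field, type]) => `${field}_${type}`)
    .join("_");
}

console.log(defaultIndexName({ body: "text" })); // body_text
```

If you'd rather choose the name yourself, createIndex() accepts a name option as part of its second argument, for example { name : "body_search" } (where body_search is an arbitrary name picked for this example).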
Create a Compound Text Index
A collection can only have one text index, but that index can be a compound index covering more than one field if required.
Let’s create a compound index that includes the body field and the abstract field.
As mentioned, a collection can only have one text index, so let’s drop the index we just created:
db.posts.dropIndex("body_text")
Output:
{ "nIndexesWas" : 2, "ok" : 1 }
Now that we’ve dropped the text index, let’s go ahead and create another one. This time it will be a compound index:
db.posts.createIndex( {
  body : "text",
  abstract : "text"
} )
Output:
{ "createdCollectionAutomatically" : false, "numIndexesBefore" : 1, "numIndexesAfter" : 2, "ok" : 1 }
That’s a confirmation message telling us that the collection had 1 index before the operation and has 2 now.
Let’s check the list of indexes again:
db.posts.getIndexes()
Result:
[ { "v" : 2, "key" : { "_id" : 1 }, "name" : "_id_" }, { "v" : 2, "key" : { "_fts" : "text", "_ftsx" : 1 }, "name" : "body_text_abstract_text", "weights" : { "abstract" : 1, "body" : 1 }, "default_language" : "english", "language_override" : "language", "textIndexVersion" : 3 } ]
Note that compound text indexes have the following restrictions:
- A compound text index cannot include any other special index types, such as multi-key or geospatial index fields.
- If the compound text index includes keys preceding the text index key, then to perform a $text search, the query predicate must include equality match conditions on the preceding keys.
- When creating a compound text index, all text index keys must be listed adjacently in the index specification document.
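To make the second and third restrictions more concrete, here's a minimal sketch. It assumes a hypothetical articles collection with a hypothetical category field (neither appears in the examples in this article), so that the posts collection used in this walkthrough is left untouched. The textKeysAreAdjacent helper is also just an illustration of the adjacency rule, not something MongoDB provides. The typeof db check is only there so the snippet can also be loaded outside the MongoDB shell.

```javascript
// An ascending key on the hypothetical `category` field precedes the two
// adjacent text keys.
const compoundSpec = { category: 1, body: "text", abstract: "text" };

// Hypothetical helper: returns true when all "text" keys sit next to each
// other in the index specification document.
function textKeysAreAdjacent(spec) {
  const types = Object.values(spec);
  const first = types.indexOf("text");
  const last = types.lastIndexOf("text");
  if (first === -1) return true; // no text keys at all
  return types.slice(first, last + 1).every((t) => t === "text");
}

// Because `category` precedes the text keys, a $text query must also
// include an equality match on `category`.
const compoundQuery = { category: "databases", $text: { $search: "abstract" } };

console.log(textKeysAreAdjacent(compoundSpec)); // true

// Only runs inside the MongoDB shell, where `db` is defined.
if (typeof db !== "undefined") {
  db.articles.createIndex(compoundSpec);
  db.articles.find(compoundQuery).forEach(printjson);
}
```

A specification such as { body : "text", category : 1, abstract : "text" } would fail the adjacency check, because a non-text key sits between the two text keys.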
Create a Wildcard Text Index
You can create a wildcard text index by using the wildcard $** field pattern. This indexes every field in the collection that contains string data.
Let’s drop the previous index and create a wildcard text index:
db.posts.dropIndex("body_text_abstract_text")
db.posts.createIndex( { "$**" : "text" } )
MongoDB also lets us create wildcard indexes, but wildcard text indexes and wildcard indexes are two distinct things.
In particular, wildcard text indexes support the $text operator, whereas wildcard indexes don’t.
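With the wildcard text index in place, a $text query can match a term in any string field. The sketch below builds such a query. The textSearch helper and the search terms are illustrative choices rather than anything MongoDB defines, and the typeof db check is only there so the snippet can also be loaded outside the MongoDB shell.

```javascript
// Hypothetical helper: builds a $text query from one or more search terms.
// MongoDB treats space-separated terms in $search as a logical OR.
function textSearch(...terms) {
  return { $text: { $search: terms.join(" ") } };
}

const wildcardQuery = textSearch("web", "abstract");
console.log(JSON.stringify(wildcardQuery)); // {"$text":{"$search":"web abstract"}}

// Only runs inside the MongoDB shell, where `db` is defined.
if (typeof db !== "undefined") {
  db.posts.find(wildcardQuery).forEach(printjson);
}
```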
The weights Parameter
When creating text indexes, you have the option of specifying a weight on one or more fields. By default, each field is given a weight of 1. But you can change this in order to give fields more or less weighting in the search results.
Example
db.posts.dropIndex("$**_text")
db.posts.createIndex(
  {
    title : "text",
    body : "text",
    abstract : "text"
  },
  {
    weights: {
      body: 10,
      abstract: 5
    }
  }
)
I started off by dropping the previous index.
When I created the new text index, I specified 3 fields, but I specified weights for just two of them.
The result is that those two fields will be weighted as specified, and the other field (title) will have the default weight of 1.
We can see this when we run getIndexes() again:
db.posts.getIndexes()
Result:
[ { "v" : 2, "key" : { "_id" : 1 }, "name" : "_id_" }, { "v" : 2, "key" : { "_fts" : "text", "_ftsx" : 1 }, "name" : "title_text_body_text_abstract_text", "weights" : { "abstract" : 5, "body" : 10, "title" : 1 }, "default_language" : "english", "language_override" : "language", "textIndexVersion" : 3 } ]
This means that the body field will have twice the significance of the abstract field, and ten times the significance of the title field.
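To spell out that arithmetic, the following sketch fills in the default weight for any field without an explicit one and then compares the fields. The effectiveWeights helper is a hypothetical illustration of what getIndexes() reported, not a MongoDB function.

```javascript
// Hypothetical helper: applies the default weight of 1 to every indexed
// field that has no explicit entry in `weights`.
function effectiveWeights(fields, weights = {}) {
  const result = {};
  for (const field of fields) {
    result[field] = weights[field] ?? 1;
  }
  return result;
}

const resolved = effectiveWeights(
  ["title", "body", "abstract"],
  { body: 10, abstract: 5 }
);

console.log(resolved.body / resolved.abstract); // 2
console.log(resolved.body / resolved.title);    // 10
```

To see how the weights affect real queries, you can project the relevance score by adding { score : { $meta : "textScore" } } as the projection in a $text query.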
Creating Multiple Language Text Indexes
You’ll notice that the above text index includes "default_language" : "english" and "language_override" : "language" in its definition.
These fields assist in dealing with documents in multiple languages. The values in the above index are the default values.
When you create a document, you can specify the language of that document by using the language field (or whichever field is named in the language_override option of the text index). If the document has no such field, MongoDB uses the language specified in the index's default_language option.
You can specify a default_language (and language_override) when you create the index.
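As a rough sketch of how that looks, the options below set Spanish as the default language and tell MongoDB to read a per-document override from a field called lang. The multilingualPosts collection, the lang field name, and the resolveLanguage helper are all assumptions made for this example. The helper simply mirrors the lookup rule described above and is not a MongoDB API. The typeof db check is only there so the snippet can also be loaded outside the MongoDB shell.

```javascript
// Index options: Spanish by default, per-document override read from `lang`.
const languageOptions = {
  default_language: "spanish",
  language_override: "lang"
};

// Hypothetical helper mirroring the rule described above: use the document's
// override field if present, otherwise the index's default language.
function resolveLanguage(doc, options) {
  return doc[options.language_override] ?? options.default_language;
}

console.log(resolveLanguage({ _id: 1, body: "Bonjour", lang: "french" }, languageOptions)); // french
console.log(resolveLanguage({ _id: 2, body: "Hola" }, languageOptions)); // spanish

// Only runs inside the MongoDB shell, where `db` is defined.
if (typeof db !== "undefined") {
  db.multilingualPosts.createIndex({ body: "text" }, languageOptions);
}
```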
See Create a Multi-Language Text Index in MongoDB for examples of creating text indexes that support multiple languages.