When you create a text index in MongoDB, you have the option of applying different weights to each indexed field.
These weights denote the significance of each indexed field relative to the others. A match in a field with a higher weight contributes more to a document's text search score than a match in a field with a lower weight.
This provides you with a certain amount of control over how the search results are calculated.
The default weight is 1, so if you don’t specify a weight for a field, it will be assigned a weight of 1.
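The default rule can be sketched in plain JavaScript. The helper below, effectiveWeights, is a hypothetical name used only for illustration and is not part of MongoDB. The field names in the usage line are made up as well. It simply fills in a weight of 1 for every indexed field that has no explicit weight.

```javascript
// Hypothetical helper (not a MongoDB API): mimics the default-weight rule.
// `fields` lists the indexed field names; `weights` holds explicit weights.
function effectiveWeights(fields, weights = {}) {
  const result = {};
  for (const field of fields) {
    // Fields without an explicit weight fall back to the default of 1.
    const hasWeight = Object.prototype.hasOwnProperty.call(weights, field);
    result[field] = hasWeight ? weights[field] : 1;
  }
  return result;
}

// Made-up field names, for illustration only.
console.log(effectiveWeights(["name", "description"], { description: 3 }));
// { name: 1, description: 3 }
```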
Example
Suppose we have a collection called posts, and it contains documents like this:
{ "_id" : 1, "title" : "The Web", "body" : "Body text...", "abstract" : "Abstract text..." }
We could create a compound text index on the three text fields and apply different weights to each one, like this:
db.posts.createIndex(
  {
    title : "text",
    body : "text",
    abstract : "text"
  },
  {
    weights: {
      body: 10,
      abstract: 5
    }
  }
)
When we created the compound text index, we specified three fields. When we specified the weights, we did so for just two of those fields.
The result is that those two fields are weighted as specified, and the other field (title) has the default weight of 1.
We can see this when we run getIndexes():
db.posts.getIndexes()
Result:
[
  {
    "v" : 2,
    "key" : { "_id" : 1 },
    "name" : "_id_"
  },
  {
    "v" : 2,
    "key" : { "_fts" : "text", "_ftsx" : 1 },
    "name" : "title_text_body_text_abstract_text",
    "weights" : { "abstract" : 5, "body" : 10, "title" : 1 },
    "default_language" : "english",
    "language_override" : "language",
    "textIndexVersion" : 3
  }
]
This means that the body field will have twice the significance of the abstract field, and ten times the significance of the title field.
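To see the weights at work, you can ask MongoDB to return each document's text score along with the results. The following is a minimal sketch for mongosh, assuming the posts collection and the text index created above. searchPosts is a hypothetical helper name, and the search term in the usage line is arbitrary.

```javascript
// Hypothetical helper: searches the text index on `posts` and returns
// matching documents with their text score, highest score first.
// `db` is the mongosh database handle.
function searchPosts(db, term) {
  return db.posts
    .find(
      { $text: { $search: term } },
      { title: 1, score: { $meta: "textScore" } }
    )
    .sort({ score: { $meta: "textScore" } });
}

// Usage in mongosh: searchPosts(db, "web")
```

All else being equal, a term that matches in the body field should push a document's score up more than the same term matching in the title field.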
Wildcard Text Indexes with Weighted Fields
You can apply weights when creating wildcard text indexes. Wildcard text indexes can be handy when you don’t know what the text fields are going to be in the documents. You may know some, but not all.
In such cases, you could create a wildcard text index and assign weights to the fields that you do know about. Any other fields will be assigned the default weight of 1.
Suppose we have the following document as a guideline:
{ "_id" : 1, "title" : "Title text...", "body" : "Body text...", "abstract" : "Abstract text...", "tags" : [ "tag1", "tag2", "tag3" ] }
It’s similar to the previous document, except that it now has a tags field that contains an array. But for all we know, future documents in that collection could have other fields, such as categories, keywords, or author_bio.
But we don’t actually know, so we will create a wildcard text index that covers all fields containing string data, and we will assign weights to some of the known fields.
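One thing to keep in mind: a MongoDB collection can have at most one text index, so if posts still has the text index from the earlier example, that index needs to be dropped before a new one can be created. A minimal sketch for mongosh follows; dropTextIndexes is a hypothetical helper name.

```javascript
// Hypothetical helper: drops any existing text index on `posts`.
// `db` is the mongosh database handle. Text indexes are recognised by
// the "_fts" : "text" entry that getIndexes() reports in their key.
function dropTextIndexes(db) {
  const dropped = [];
  for (const index of db.posts.getIndexes()) {
    if (index.key && index.key._fts === "text") {
      db.posts.dropIndex(index.name);
      dropped.push(index.name);
    }
  }
  return dropped; // names of the indexes that were dropped
}

// Usage in mongosh: dropTextIndexes(db)
```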
Example:
db.posts.createIndex(
  { "$**": "text" },
  {
    weights: {
      body: 10,
      abstract: 5
    }
  }
)
In this case, the body field gets a weight of 10 and the abstract field gets a weight of 5. This means that the body field has twice the impact of the abstract field, and ten times the impact of all other text fields (because they will be assigned the default weight of 1).
After creating that index, if we call getIndexes(), we can see the weightings given to the fields:
db.posts.getIndexes()
Result:
[
  {
    "v" : 2,
    "key" : { "_id" : 1 },
    "name" : "_id_"
  },
  {
    "v" : 2,
    "key" : { "_fts" : "text", "_ftsx" : 1 },
    "name" : "$**_text",
    "weights" : { "$**" : 1, "abstract" : 5, "body" : 10 },
    "default_language" : "english",
    "language_override" : "language",
    "textIndexVersion" : 3
  }
]
As expected, the body field gets 10, the abstract field gets 5, and all others get 1.
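If you only want the weights rather than the full index listing, you can pull them out of the getIndexes() result. This is a small sketch for mongosh; textIndexWeights is a hypothetical helper name.

```javascript
// Hypothetical helper: returns the weights document of the text index
// on `posts`, or null if the collection has no text index.
// `db` is the mongosh database handle.
function textIndexWeights(db) {
  const textIndex = db.posts
    .getIndexes()
    .find((index) => index.key && index.key._fts === "text");
  return textIndex ? textIndex.weights : null;
}

// Usage in mongosh: textIndexWeights(db)
```

For the wildcard index created above, this should return the same weights document that appears in the getIndexes() result.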