MongoDB $bsonSize

Starting in MongoDB 4.4, you can use the $bsonSize aggregation pipeline operator to return the size, in bytes, of a given document when it is encoded as BSON.

$bsonSize accepts any valid expression as long as it resolves to either an object or null. If the expression resolves to null, $bsonSize returns null.

Example

Suppose we have a collection called bars with the following document:

{
	"_id" : 1,
	"name" : "Boardwalk Social",
	"location" : {
		"type" : "Point",
		"coordinates" : [
			-16.919297718553366,
			145.77675259719823
		]
	},
	"categories" : [
		"Bar",
		"Restaurant",
		"Hotel"
	],
	"reviews" : [
		{
			"name" : "Steve",
			"date" : "20 December, 2020",
			"rating" : 5,
			"comments" : "Great vibe."
		},
		{
			"name" : "Lisa",
			"date" : "25 October, 2020",
			"rating" : 3,
			"comments" : "They just raised their prices :("
		},
		{
			"name" : "Kim",
			"date" : "21 October, 2020",
			"rating" : 4,
			"comments" : "Nice for Friday happy hour"
		}
	]
}

We can see that the location field contains a document, and the reviews field contains an array of documents.

Let’s use the $bsonSize operator to check the size of the location field:

db.bars.aggregate([
  {
    $project: {
      "locationSize": { $bsonSize: "$location" }
    }
  }
])

Result:

{ "_id" : 1, "locationSize" : 61 }

In this case, the size of the location field is 61 bytes.
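To see where the 61 bytes come from, here is a minimal sketch in plain JavaScript (run with Node.js, not in the database) that adds up the BSON encoding by hand. The function names bsonSize, valueSize and utf8Length are hypothetical helpers written for this illustration. The sketch covers only the types used in this article and assumes every number is stored as an 8-byte double, which is how the legacy mongo shell stores numeric literals. A client that stores whole numbers as 32-bit integers would give smaller sizes.

```javascript
// Hypothetical helper: number of bytes a string occupies in UTF-8.
function utf8Length(str) {
  return Buffer.byteLength(str, "utf8");
}

// Size of the payload that follows an element's type byte and key.
function valueSize(value) {
  if (value === null) return 0;                  // null carries no payload
  if (typeof value === "number") return 8;       // assumption: stored as a 64-bit double
  if (typeof value === "string") {
    return 4 + utf8Length(value) + 1;            // int32 length + bytes + NUL terminator
  }
  if (value instanceof Date) return 8;           // int64 milliseconds since the epoch
  if (typeof value === "object") {
    return bsonSize(value);                      // embedded document or array
  }
  throw new TypeError("Type not covered by this sketch");
}

// Size of a document. Arrays are encoded as documents keyed "0", "1", ...
function bsonSize(doc) {
  let size = 4 + 1;                              // int32 length prefix + trailing 0x00
  for (const [key, value] of Object.entries(doc)) {
    size += 1 + utf8Length(key) + 1;             // type byte + key as a NUL-terminated string
    size += valueSize(value);
  }
  return size;
}

const locationDoc = {
  type: "Point",
  coordinates: [-16.919297718553366, 145.77675259719823]
};

console.log(bsonSize(locationDoc)); // 61
```

The breakdown is 5 bytes of document overhead, 16 bytes for the type field, and 40 bytes for the coordinates field (a 13-byte element header plus a 27-byte array).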

Objects in Arrays

Here’s an example of getting the size of a document that’s an element of an array:

db.bars.aggregate([
  {
    $project: {
      "review": { $arrayElemAt: [ "$reviews", 0 ] },
      "reviewSize": { $bsonSize: { $arrayElemAt: [ "$reviews", 0 ] } }
    }
  }
]).pretty()

Result:

{
	"_id" : 1,
	"review" : {
		"name" : "Steve",
		"date" : "20 December, 2020",
		"rating" : 5,
		"comments" : "Great vibe."
	},
	"reviewSize" : 91
}

In this case, we use $arrayElemAt to return the actual review, and then again to return the size of that review.

Array indexes in MongoDB are zero-based, so index 0 refers to the first review.

Get the Size of the Top Level Document

We can use the $$ROOT system variable to refer to the top-level document, also called the root document. This is the document that’s currently being processed by the pipeline.

Therefore, we can pass the $$ROOT variable to $bsonSize to get the size of the whole document that’s currently being processed.

Example:

db.bars.aggregate([
  {
    $project: {
      "rootSize": { $bsonSize: "$$ROOT" }
    }
  }
])

Result:

{ "_id" : 1, "rootSize" : 502 }

In this case, the document is 502 bytes.

Wrong Data Types

As mentioned, $bsonSize accepts any valid expression as long as it resolves to an object or null.

Here’s an example of what happens if you provide an expression that resolves to a different BSON type:

db.bars.aggregate([
  {
    $project: {
      "nameSize": { $bsonSize: "$name" }
    }
  }
])

Result:

Error: command failed: {
	"ok" : 0,
	"errmsg" : "$bsonSize requires a document input, found: string",
	"code" : 31393,
	"codeName" : "Location31393"
} : aggregate failed :
_getErrorWithCode@src/mongo/shell/utils.js:25:13
doassert@src/mongo/shell/assert.js:18:14
_assertCommandWorked@src/mongo/shell/assert.js:618:17
assert.commandWorked@src/mongo/shell/assert.js:708:16
DB.prototype._runAggregate@src/mongo/shell/db.js:266:5
DBCollection.prototype.aggregate@src/mongo/shell/collection.js:1046:12
@(shell):1:1

In this case, we tried to find the size of a string, but that’s not one of the supported BSON types, so we get an error.

However, all is not lost. We can use $binarySize to get the size of a string.
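As a rough sketch of that alternative, the pipeline below swaps $bsonSize for $binarySize on the name field of the bars collection. The pipeline variable and the guard around the db call are additions for illustration, so that the snippet also loads outside the shell.

```javascript
// Same $project stage as before, but using $binarySize,
// which accepts a string (or binary data) instead of a document.
const pipeline = [
  {
    $project: {
      "nameSize": { $binarySize: "$name" }
    }
  }
];

// Only run the query when a shell-provided `db` object is available.
if (typeof db !== "undefined") {
  printjson(db.bars.aggregate(pipeline).toArray());
}
```

For the sample document, this should report a nameSize of 16, because "Boardwalk Social" is 16 single-byte characters.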

Get the Total Size of All Documents in a Collection

Suppose we have a collection called cats with the following documents:

{ "_id" : 1, "name" : "Scratch", "born" : "March, 2020" }
{ "_id" : 2, "name" : "Meow", "weight" : 30 }
{ "_id" : 3, "name" : "Fluffy", "height" : 15 }
{ "_id" : 4, "name" : "Sox", "weight" : 40 }
{ "_id" : 5, "name" : null, "weight" : 20 }
{ "_id" : 6, "height" : 20, "born" : ISODate("2021-01-03T23:30:15.123Z") }

As previously shown, we can pass $$ROOT to $bsonSize to get the size of each top-level document as it is processed:

db.cats.aggregate([
  {
    $project: {
      "rootSize": { $bsonSize: "$$ROOT" }
    }
  }
])

Result:

{ "_id" : 1, "rootSize" : 58 }
{ "_id" : 2, "rootSize" : 49 }
{ "_id" : 3, "rootSize" : 51 }
{ "_id" : 4, "rootSize" : 48 }
{ "_id" : 5, "rootSize" : 40 }
{ "_id" : 6, "rootSize" : 48 }

But we can also get the total size of all the documents in the collection.

We can achieve this as follows:

db.cats.aggregate([
  {
    $group: {
      "_id": null,
      "rootSize": { $sum: { $bsonSize: "$$ROOT" } }
    }
  }
])

Result:

{ "_id" : null, "rootSize" : 294 }

Here, we grouped the results using the $group stage with an _id of null, which places all documents in a single group. We could have used any other constant value.

We also used $sum to calculate the combined sizes of the various documents.

We can see that the total size of all documents in the collection is 294 bytes, which we can confirm by adding up the results in the previous example.
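The addition can be double-checked with a few lines of plain JavaScript, using the per-document sizes reported for the cats collection. The sizes array is simply those six results typed in by hand.

```javascript
// Per-document sizes from the earlier $project result, in _id order.
const sizes = [58, 49, 51, 48, 40, 48];

// Add them up, starting from zero.
const total = sizes.reduce((sum, size) => sum + size, 0);

console.log(total); // 294
```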

Object.bsonSize() Method

Another way to get a document’s size is to use the Object.bsonSize() method.