MongoDB $avg Aggregation Pipeline Operator

In MongoDB, the $avg aggregation pipeline operator calculates and returns the average value of the specified numeric values.

Syntax

The $avg operator supports two syntaxes.

Syntax 1:

{ $avg: <expression> }

Syntax 2:

{ $avg: [ <expression1>, <expression2> ... ]  }

The first syntax accepts one argument and the second syntax accepts multiple arguments.

When used in the $group stage, you can only use the first syntax. In this case, $avg returns the collective average of all the numeric values that result from applying a specified expression to each document in a group of documents that share the same group by key.

Examples of Syntax 1 (Single Argument)

Here are a couple of examples that use syntax 1.

Grouped Documents

This example uses $avg in conjunction with $group to return the average across a group of documents that are grouped by key.

Suppose we have a collection called pets with the following documents:

{ "_id" : 1, "name" : "Wag", "type" : "Dog", "weight" : 20 }
{ "_id" : 2, "name" : "Bark", "type" : "Dog", "weight" : 10 }
{ "_id" : 3, "name" : "Meow", "type" : "Cat", "weight" : 7 }
{ "_id" : 4, "name" : "Scratch", "type" : "Cat", "weight" : 8 }
{ "_id" : 5, "name" : "Bruce", "type" : "Kangaroo", "weight" : 100 }
{ "_id" : 6, "name" : "Hop", "type" : "Kangaroo", "weight" : 130 }
{ "_id" : 7, "name" : "Punch", "type" : "Kangaroo", "weight" : 200 }
{ "_id" : 8, "name" : "Snap", "type" : "Cat", "weight" : 12 }
{ "_id" : 9, "name" : "Ruff", "type" : "Dog", "weight" : 30 }

We can group these documents by their type field, and then use $avg to return the average weight of each group:

db.pets.aggregate(
   [
     {
       $group:
          {
            _id: "$type",
            average_weight: { $avg: "$weight" }
          }
     }
   ]
)

Result:

{ "_id" : "Dog", "average_weight" : 20 }
{ "_id" : "Cat", "average_weight" : 9 }
{ "_id" : "Kangaroo", "average_weight" : 143.33333333333334 }

Arrays

This example applies $avg to a single document that contains a field with an array of values.

This option is only available when using the single argument syntax. Arrays are ignored when using the multi-argument syntax (more on this below).

Suppose we have a collection called players with the following documents:

{ "_id" : 1, "player" : "Homer", "scores" : [ 1, 7, 2, 3, 8, 7, 1 ] }
{ "_id" : 2, "player" : "Marge", "scores" : [ 0, 1, 8, 17, 18, 8 ] }
{ "_id" : 3, "player" : "Bart", "scores" : [ 15, 11, 8, 0, 1, 3 ] }
{ "_id" : 4, "player" : "Brian", "scores" : [ 7 ] }
{ "_id" : 5, "player" : "Farnsworth", "scores" : [ ] }
{ "_id" : 6, "player" : "Meg", "scores" : null }
{ "_id" : 7, "player" : "Ron" }

We can apply $avg to the scores field in each document:

db.players.aggregate(
   [
     {
       $project:
          {
            player: 1,
            averageScore: { $avg: "$scores" }
          }
     }
   ]
)

Result:

{ "_id" : 1, "player" : "Homer", "averageScore" : 4.142857142857143 }
{ "_id" : 2, "player" : "Marge", "averageScore" : 8.666666666666666 }
{ "_id" : 3, "player" : "Bart", "averageScore" : 6.333333333333333 }
{ "_id" : 4, "player" : "Brian", "averageScore" : 7 }
{ "_id" : 5, "player" : "Farnsworth", "averageScore" : null }
{ "_id" : 6, "player" : "Meg", "averageScore" : null }
{ "_id" : 7, "player" : "Ron", "averageScore" : null }

In this case, the first four documents returned the average of the various numbers that were in their respective arrays.

In document 4’s case, this was the same as the number, because there was only one number in the array.

Document 5 returned null because we provided an empty array.

Document 6 returned null because we provided null as the argument.

Document 7 returned null because the field didn’t even exist.

Example of Syntax 2 (Multiple Arguments)

The second syntax involves providing $avg with more than one argument. $avg then calculates the average based on all of the supplied arguments.

Suppose we have a collection called data with the following documents:

{ "_id" : 1, "a" : 1, "b" : 2, "c" : 3, "d" : 4 }
{ "_id" : 2, "a" : 1, "b" : 2, "c" : 3, "d" : [ 4 ] }
{ "_id" : 3, "a" : 1, "b" : 2, "c" : 3, "d" : "Hey" }
{ "_id" : 4, "a" : "One", "b" : "Two", "c" : "Three", "d" : "Four" }

We can use $avg to return the average of the a, b, c, and d fields of each document:

db.data.aggregate(
   [
     {
       $project:
          {
            avg: { $avg: [ "$a", "$b", "$c", "$d" ] }
          }
     }
   ]
)

Result:

{ "_id" : 1, "avg" : 2.5 }
{ "_id" : 2, "avg" : 2 }
{ "_id" : 3, "avg" : 2 }
{ "_id" : 4, "avg" : null }

Document 1 returns the average of the input values of 1, 2, 3, and 4.

However, the next two documents only returned the average of the input values of 1, 2, and 3. The $avg operator ignored their d fields.

This is because $avg ignores non-numeric values. So in this case it ignored "Hey" in document 3 and calculated the average from the remaining (numeric) fields.

As for document 2, its d field contains an array. As mentioned, the $avg operator ignores arrays when using the multi-argument syntax. More precisely, it treats arrays as non-numerical values when used in this context, and $avg ignores non-numerical values.

If all values are non-numeric, then $avg returns null. We can see this with document 4.

Missing Fields

When using the multi-argument syntax, $avg ignores any missing fields. That is, if you supply a field that doesn’t exist, it ignores it. If none of the fields exist, then it returns null.

Example:

db.data.aggregate(
   [
     {
       $project:
          {
            avg: { $avg: [ "$a", "$b", "$c", "$d", "$e" ] }
          }
     }
   ]
)

Result:

{ "_id" : 1, "avg" : 2.5 }
{ "_id" : 2, "avg" : 2 }
{ "_id" : 3, "avg" : 2 }
{ "_id" : 4, "avg" : null }

In this case I provided an extra field ($e) that doesn’t exist in the documents. $avg calculated the average based on the remaining fields that do exist.

However, here’s what happens when none of the fields exist:

db.data.aggregate(
   [
     {
       $project:
          {
            result: { $avg: [ "$x", "$y", "$z" ] }
          }
     }
   ]
)

Result:

{ "_id" : 1, "result" : null }
{ "_id" : 2, "result" : null }
{ "_id" : 3, "result" : null }
{ "_id" : 4, "result" : null }

The result is null for all documents.

As we saw earlier, when using the single-argument syntax, a missing field results in null.

Example:

db.pets.aggregate(
   [
     {
       $group:
          {
            _id: "$type",
            avg: { $avg: "$oops!" }
          }
     }
   ]
)

Result:

{ "_id" : "Kangaroo", "avg" : null }
{ "_id" : "Cat", "avg" : null }
{ "_id" : "Dog", "avg" : null }

Available Stages

$avg is available in the in the following stages:

  • $group
  • $project
  • $addFields
  • $set
  • $replaceRoot
  • $replaceWith
  • $match stage that includes an $expr expression