Understanding the AVG() Function in SQL Server

Like most other RDBMSs, SQL Server has an AVG() function that returns the average of the values in a group.

Below are examples that demonstrate how the AVG() function works in SQL Server.

Syntax

First, here’s a look at the syntax:

AVG ( [ ALL | DISTINCT ] expression )  
   [ OVER ( [ partition_by_clause ] order_by_clause ) ]

So the only required argument is the expression for which we want the average. The others are optional.

Example

Suppose we have the following table:

SELECT 
    ProductId,
    VendorId,
    ProductName,
    ProductPrice
FROM Products;

Result:

ProductId  VendorId  ProductName                      ProductPrice
---------  --------  -------------------------------  ------------
1          1001      Left handed screwdriver          25.99       
2          1001      Right handed screwdriver         25.99       
3          1001      Long Weight (blue)               14.75       
4          1001      Long Weight (green)              11.99       
5          1002      Sledge Hammer                    33.49       
6          1003      Chainsaw                         245         
7          1003      Straw Dog Box                    55.99       
8          1004      Bottomless Coffee Mugs (4 Pack)  9.99        
9          1004      Tea Pot                          12.45       
10         1003      Hammock                          10          

We can use the AVG() function to find out the average price:

SELECT 
    AVG(ProductPrice)
FROM Products;

Result:

44.564

In this example, we got the average price across all products.

We can narrow it down if we like. For example, we can use a WHERE clause to narrow the results to just a given vendor.

Let’s get the average price from a specific vendor:

SELECT 
    AVG(ProductPrice)
FROM Products
WHERE VendorId = 1001;

Result:

19.68

This time we can see that the average price was calculated for just the vendor in question.

The DISTINCT Clause

We can use the DISTINCT clause to calculate the average from only distinct values. That is, we can use it to eliminate redundant duplicates from the calculation.

In our case, two products share the same price. We can use the DISTINCT clause to treat those as one product for the purpose of calculating the average price.

Example:

SELECT 
    AVG(ProductPrice) AS "Average",
    AVG(DISTINCT ProductPrice) AS "Distinct Average"
FROM Products;

Result:

Average  Distinct Average
-------  ----------------
44.564   46.627777       

We can see that there’s a slight discrepancy between the two values returned. The Average column took all values into account when calculating its result. This includes two instances of 25.99 for the first two products in the table.

The Distinct Average column on the other hand, calculated its value as though there was only one instance of 25.99.

The GROUP BY Clause

We can use the GROUP BY clause to get the average value within each group:

SELECT
    VendorId,
    AVG(ProductPrice) AS "Average Price for this Vendor"
FROM Products
GROUP BY VendorId;

Result:

VendorId  Average Price for this Vendor
--------  -----------------------------
1001      19.68                        
1002      33.49                        
1003      103.663333                   
1004      11.225                       

Here we got the average price for each vendor. We did that by grouping the results by the VendorId column.

We can also apply the DISTINCT clause to these results if we so desire:

SELECT
    VendorId,
    AVG(DISTINCT ProductPrice) AS "Average Distinct Price for this Vendor"
FROM Products
GROUP BY VendorId;

Result:

VendorId  Average Distinct Price for this Vendor
--------  --------------------------------------
1001      17.576666                             
1002      33.49                                 
1003      103.663333                            
1004      11.225                                

Only the first vendor’s value changed. That’s because only the first vendor has duplicate prices. The products for all other vendors have unique prices.

The OVER Clause

We can incorporate the OVER clause into our query in order to use AVG() as a window function.

Here’s an example:

SELECT 
    VendorId,
    ProductName,
    ProductPrice,
    AVG(ProductPrice) OVER (PARTITION BY VendorId) 
        AS "Average Price for this Vendor"
FROM Products
ORDER BY VendorId;

Result:

VendorId  ProductName                      ProductPrice  Average Price for this Vendor
--------  -------------------------------  ------------  -----------------------------
1001      Left handed screwdriver          25.99         19.68                        
1001      Right handed screwdriver         25.99         19.68                        
1001      Long Weight (blue)               14.75         19.68                        
1001      Long Weight (green)              11.99         19.68                        
1002      Sledge Hammer                    33.49         33.49                        
1003      Chainsaw                         245           103.663333                   
1003      Straw Dog Box                    55.99         103.663333                   
1003      Hammock                          10            103.663333                   
1004      Bottomless Coffee Mugs (4 Pack)  10            11.225                       
1004      Tea Pot                          12.45         11.225                       

In this case, the products are listed along with their prices, and also the average price for that vendor.

We can use the ORDER BY clause to manipulate the average price column. In SQL Server, AVG() is a deterministic function when used without the OVER and ORDER BY clauses. It is nondeterministic when specified with the OVER and ORDER BY clauses.

Here’s what happens if we sort by price in ascending order within the OVER clause:

SELECT 
    VendorId,
    ProductName,
    ProductPrice,
    AVG(ProductPrice) OVER (PARTITION BY VendorId ORDER BY ProductPrice) 
        AS "Average Price for this Vendor"
FROM Products
ORDER BY VendorId;

Result:

VendorId  ProductName                      ProductPrice  Average Price for this Vendor
--------  -------------------------------  ------------  -----------------------------
1001      Long Weight (green)              11.99         11.99                        
1001      Long Weight (blue)               14.75         13.37                        
1001      Left handed screwdriver          25.99         19.68                        
1001      Right handed screwdriver         25.99         19.68                        
1002      Sledge Hammer                    33.49         33.49                        
1003      Hammock                          10            10                           
1003      Straw Dog Box                    55.99         32.995                       
1003      Chainsaw                         245           103.663333                   
1004      Bottomless Coffee Mugs (4 Pack)  10            10                           
1004      Tea Pot                          12.45         11.225                       

This time the average price was calculated as a “running average”. In other words, the average price was calculated based on the rows processed so far (within each vendor). So as each row is returned, its average price reflects all rows returned up until that point. This has the effect of changing the average price as the results are processed.

Let’s sort it in descending order:

SELECT 
    VendorId,
    ProductName,
    ProductPrice,
    AVG(ProductPrice) OVER (PARTITION BY VendorId ORDER BY ProductPrice DESC) 
        AS "Average Price for this Vendor"
FROM Products
ORDER BY VendorId;

Result:

VendorId  ProductName                      ProductPrice  Average Price for this Vendor
--------  -------------------------------  ------------  -----------------------------
1001      Left handed screwdriver          25.99         25.99                        
1001      Right handed screwdriver         25.99         25.99                        
1001      Long Weight (blue)               14.75         22.243333                    
1001      Long Weight (green)              11.99         19.68                        
1002      Sledge Hammer                    33.49         33.49                        
1003      Chainsaw                         245           245                          
1003      Straw Dog Box                    55.99         150.495                      
1003      Hammock                          10            103.663333                   
1004      Tea Pot                          12.45         12.45                        
1004      Bottomless Coffee Mugs (4 Pack)  10            11.225                       

This time the average prices (or the “running averages”) are different, due to the window function being sorted in a different order.

NULL Values

SQL Server’s AVG() function ignores NULL values. Therefore, NULL values aren’t included in the calculation of the average.