When using MySQL, you can use the AVG()
function to calculate the average value from a range of values.
For example, you can use this function to find out what the average city population is for a given country or state. Given a country will have many cities, each with different populations, you can find out what the average is between them. One city might have a population of say, 50,000 while another has a population of 500,000. The AVG()
function will calculate the average for you.
Syntax
The syntax of AVG()
goes like this:
AVG([DISTINCT] expr) [over_clause]
Where expr
is the expression for which you want the average.
The over_clause
is an optional clause that works with window functions. The way window functions work is that, for each row from a query, perform a calculation using rows related to that row. In this case, the over_clause
specifies how to partition query rows into groups for processing by the window function. Note that the over_clause
can only be used if you don’t use the DISTINCT
keyword.
Raw Data
First, here’s the raw data that we’ll use for the examples on this page:
SELECT Name, District, Population FROM City WHERE CountryCode = 'AUS';
Result:
+---------------+-----------------+------------+ | Name | District | Population | +---------------+-----------------+------------+ | Sydney | New South Wales | 3276207 | | Melbourne | Victoria | 2865329 | | Brisbane | Queensland | 1291117 | | Perth | West Australia | 1096829 | | Adelaide | South Australia | 978100 | | Canberra | Capital Region | 322723 | | Gold Coast | Queensland | 311932 | | Newcastle | New South Wales | 270324 | | Central Coast | New South Wales | 227657 | | Wollongong | New South Wales | 219761 | | Hobart | Tasmania | 126118 | | Geelong | Victoria | 125382 | | Townsville | Queensland | 109914 | | Cairns | Queensland | 92273 | +---------------+-----------------+------------+
Basic Example
As you can see from the raw data above, some “Districts” have more than one city, and each city has a different population. We can use the AVG()
function to find the average population of the cities within each district:
SELECT District, AVG(Population) AS 'Average Population' FROM City WHERE CountryCode = 'AUS' AND District = 'New South Wales';
Result:
+-----------------+--------------------+ | District | Average Population | +-----------------+--------------------+ | New South Wales | 998487.2500 | +-----------------+--------------------+
So we can see that New South Wales has an average city population of 998487.2500.
Rounding the Average
You probably noticed that the result from the previous example was quite precise – it returned a result to four decimal places. In our case, this is probably a bit too much. We don’t need a result to the fourth decimal place. Therefore, we can use the ROUND()
function to round it to the nearest whole number (or to a specified number of decimal places if we so desire).
Here’s an example of rounding the result to the nearest whole number:
SELECT District, ROUND(AVG(Population)) AS 'Average Population' FROM City WHERE CountryCode = 'AUS' AND District = 'New South Wales';
Result:
+-----------------+--------------------+ | District | Average Population | +-----------------+--------------------+ | New South Wales | 998487 | +-----------------+--------------------+
By default, the ROUND()
function rounds to the nearest whole number. Alternatively, we could have provided a 2nd argument to specify a number of decimal places, but in this case, we don’t want that.
The GROUP BY Clause
We can use the GROUP BY
clause to list out each district/state, along with their average city populations:
SELECT District, ROUND(AVG(Population)) AS 'Average Population' FROM City WHERE CountryCode = 'AUS' GROUP BY District;
Result:
+-----------------+--------------------+ | District | Average Population | +-----------------+--------------------+ | New South Wales | 998487 | | Victoria | 1495356 | | Queensland | 451309 | | West Australia | 1096829 | | South Australia | 978100 | | Capital Region | 322723 | | Tasmania | 126118 | +-----------------+--------------------+
The ORDER BY Clause
We can also use the ORDER BY
clause to specify a column with which to order by:
SELECT District, ROUND(AVG(Population)) AS 'Average Population' FROM City WHERE CountryCode = 'AUS' GROUP BY District ORDER BY `Average Population`;
Result:
+-----------------+--------------------+ | District | Average Population | +-----------------+--------------------+ | Tasmania | 126118 | | Capital Region | 322723 | | Queensland | 451309 | | South Australia | 978100 | | New South Wales | 998487 | | West Australia | 1096829 | | Victoria | 1495356 | +-----------------+--------------------+
Note that, when ordering by a multi-word alias (like `Average Population`
), you need to use the backtick character (`
) instead of the apostrophe ('
) to surround the two words.
Find the Average Character Length
The AVG()
function isn’t limited to just columns with numerical data. You can also combine AVG()
with other functions to find averages in other areas.
For example, using our sample data, we can find the average character length of all the values in the City
column:
SELECT ROUND(AVG(CHAR_LENGTH(Name))) AS 'Average Character Length' FROM city WHERE CountryCode = 'AUS' AND District = 'New South Wales';
Result:
+--------------------------+ | Average Character Length | +--------------------------+ | 10 | +--------------------------+
Using an OVER Clause
As mentioned, the syntax allows for an OVER
clause to be included in your queries. This can get slightly complex, but basically, the OVER
clause allows you to specify how to partition query rows into groups for processing by the window function.
Here’s an example:
SELECT District, Name AS City, Population, ROUND(AVG(Population) OVER(PARTITION BY District)) AS 'District Avg' FROM City WHERE CountryCode = 'AUS' ORDER BY District;
Result:
+-----------------+---------------+------------+--------------+ | District | City | Population | District Avg | +-----------------+---------------+------------+--------------+ | Capital Region | Canberra | 322723 | 322723 | | New South Wales | Sydney | 3276207 | 998487 | | New South Wales | Wollongong | 219761 | 998487 | | New South Wales | Newcastle | 270324 | 998487 | | New South Wales | Central Coast | 227657 | 998487 | | Queensland | Townsville | 109914 | 451309 | | Queensland | Brisbane | 1291117 | 451309 | | Queensland | Cairns | 92273 | 451309 | | Queensland | Gold Coast | 311932 | 451309 | | South Australia | Adelaide | 978100 | 978100 | | Tasmania | Hobart | 126118 | 126118 | | Victoria | Melbourne | 2865329 | 1495356 | | Victoria | Geelong | 125382 | 1495356 | | West Australia | Perth | 1096829 | 1096829 | +-----------------+---------------+------------+--------------+
This example partitions the rows by District
, providing the average for each partition (district). This allows you to see more granular data, such as the city population, along with the average population for the district that it belongs in.