In MySQL, ANY_VALUE()
is an aggregate function that allows us to include nonaggregated columns in the SELECT
list when using the GROUP BY
clause.
The ANY_VALUE()
function is commonly used to resolve issues that arise when a query includes columns that are not part of the GROUP BY
clause or aggregate functions. It can be useful in scenarios where we want to include descriptive columns in a grouped result set without affecting the grouping behaviour.
Syntax
The syntax goes like this:
ANY_VALUE(arg)
Where arg
represents the column or value we want to include in the result set. It can be a column name, a literal value, or an expression involving columns and operators.
Functionality
The primary purpose of the ANY_VALUE()
function is to eliminate the ambiguity that arises when nonaggregated columns are included in a GROUP BY
query.
By default, when the ONLY_FULL_GROUP_BY
SQL mode is enabled, MySQL requires that all nonaggregated columns in the SELECT
list be included in the GROUP BY
clause. However, in some cases, we may want to include additional columns without affecting the grouping behaviour.
When we apply the ANY_VALUE()
function to a nonaggregated column within a GROUP BY
query, MySQL selects an arbitrary value from the group to display in the result set. It effectively informs MySQL that any value from the group is acceptable, resolving the ambiguity and allowing the query to execute successfully.
Example
Suppose we run the following query:
SELECT
District,
Name,
Population
FROM City
WHERE CountryCode = 'AUS';
Result:
+-----------------+---------------+------------+ | District | Name | Population | +-----------------+---------------+------------+ | New South Wales | Sydney | 3276207 | | Victoria | Melbourne | 2865329 | | Queensland | Brisbane | 1291117 | | West Australia | Perth | 1096829 | | South Australia | Adelaide | 978100 | | Capital Region | Canberra | 322723 | | Queensland | Gold Coast | 311932 | | New South Wales | Newcastle | 270324 | | New South Wales | Central Coast | 227657 | | New South Wales | Wollongong | 219761 | | Tasmania | Hobart | 126118 | | Victoria | Geelong | 125382 | | Queensland | Townsville | 109914 | | Queensland | Cairns | 92273 | +-----------------+---------------+------------+
This query returns a list of cities and their populations, along with the district that each city belongs to.
Now suppose we want to group the results by district, and include the population for each district. And we don’t want to include the city names in the grouping, but we would like to have a sample city returned for each district.
We can do the following to achieve that outcome:
SELECT
District,
ANY_VALUE(Name),
SUM(Population) FROM City
WHERE CountryCode = 'AUS'
GROUP BY District;
Result:
+-----------------+-----------------+-----------------+ | District | ANY_VALUE(Name) | SUM(Population) | +-----------------+-----------------+-----------------+ | New South Wales | Sydney | 3993949 | | Victoria | Melbourne | 2990711 | | Queensland | Brisbane | 1805236 | | West Australia | Perth | 1096829 | | South Australia | Adelaide | 978100 | | Capital Region | Canberra | 322723 | | Tasmania | Hobart | 126118 | +-----------------+-----------------+-----------------+
This query does exactly what we want. It returns the total population of each district, and displays a sample city from within each district. The sample cities were generated by the ANY_VALUE()
function.
In the above query, the ANY_VALUE()
function allows us to include the Name
column in the result set without including it in the GROUP BY
clause. We are still able to group the results by the District
column and obtain the total population of each district. The ANY_VALUE()
function ensures that any city name within the given district is displayed in the result set, even though it is not part of the grouping operation.
Here’s what happens when I remove the ANY_VALUE()
function:
SELECT
District,
Name,
SUM(Population) FROM City
WHERE CountryCode = 'AUS'
GROUP BY District;
Result:
ERROR 1055 (42000): Expression #2 of SELECT list is not in GROUP BY clause and contains nonaggregated column 'world.City.Name' which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by
I get an error, because my ONLY_FULL_GROUP_BY
SQL mode is enabled. Let’s verify that:
SELECT @@sql_mode;
Result:
+-----------------------------------------------------------------------------------------------------------------------+ | @@sql_mode | +-----------------------------------------------------------------------------------------------------------------------+ | ONLY_FULL_GROUP_BY,STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_ENGINE_SUBSTITUTION | +-----------------------------------------------------------------------------------------------------------------------+
Yes, we can see the first item is ONLY_FULL_GROUP_BY
.
When ONLY_FULL_GROUP_BY
is enabled, MySQL rejects queries for which the select list, HAVING
condition, or ORDER BY
list refer to nonaggregated columns that are neither named in the GROUP BY
clause nor are functionally dependent on them.
We can either disable ONLY_FULL_GROUP_BY
or use ANY_VALUE()
to overcome this restriction and display results like in the above example. The ANY_VALUE()
function allows us to generate a result without changing our sql_mode
.
Just to be sure, here’s what happens if we add the Name
column to the GROUP BY
clause:
SELECT
District,
Name,
SUM(Population) FROM City
WHERE CountryCode = 'AUS'
GROUP BY District, Name;
Result:
+-----------------+---------------+-----------------+ | District | Name | SUM(Population) | +-----------------+---------------+-----------------+ | New South Wales | Sydney | 3276207 | | Victoria | Melbourne | 2865329 | | Queensland | Brisbane | 1291117 | | West Australia | Perth | 1096829 | | South Australia | Adelaide | 978100 | | Capital Region | Canberra | 322723 | | Queensland | Gold Coast | 311932 | | New South Wales | Newcastle | 270324 | | New South Wales | Central Coast | 227657 | | New South Wales | Wollongong | 219761 | | Tasmania | Hobart | 126118 | | Victoria | Geelong | 125382 | | Queensland | Townsville | 109914 | | Queensland | Cairns | 92273 | +-----------------+---------------+-----------------+
We no longer get the error, but it has completely screwed up our results in the process. We no longer get the total population for each district, which was our main goal. It’s basically outputting the results of our original query without the GROUP BY
clause. This demonstrates the benefit of the ANY_VALUE()
function.
That said, sometimes we might want to include all the cities in our query results (while also obtaining the total population of each district). In such cases, we could use the WITH ROLLUP
modifier to calculate the totals for each district.
Conclusion
The ANY_VALUE()
function in MySQL enables the inclusion of nonaggregated columns in the SELECT
clause without affecting the grouping behaviour. It serves as a handy option for enhancing query flexibility and addressing the challenges associated with nonaggregated columns in GROUP BY
queries.