If you’re getting a PostgreSQL error that reads “SELECT DISTINCT ON expressions must match initial ORDER BY expressions” when trying to run a query, it’s probably because the initial columns provided to your ORDER BY
clause are different to the ones provided to the DISTINCT ON
clause.
To fix this error, make sure the initial columns provided to the ORDER BY
clause are included in the DISTINCT ON
clause.
Example of Error
Here’s an example of code that produces the error:
SELECT
DISTINCT ON(FirstName, Age)
*
FROM Dogs
ORDER BY DogId;
Result:
SELECT DISTINCT ON expressions must match initial ORDER BY expressions
I got the error because I passed the DogId
column to the ORDER BY
clause, but that column isn’t included in the DISTINCT ON
clause.
I get the same error if I do the following:
SELECT
DISTINCT ON(FirstName, Age)
*
FROM Dogs
ORDER BY DogId, FirstName, Age;
Result:
SELECT DISTINCT ON expressions must match initial ORDER BY expressions
So I still got the error, even though my ORDER BY
clause includes the columns from the DISTINCT ON
clause. The problem here is that I started my ORDER BY
clause with a column that’s not in the DISTINCT ON
clause.
Solution
To fix this issue, we need to ensure the initial columns provided to the ORDER BY
clause are also included in the DISTINCT ON
clause. If we include any other columns in our ORDER BY
clause, they must come after the columns that are in the DISTINCT ON
clause.
So here’s a quick fix for the above issue:
SELECT
DISTINCT ON(FirstName, Age)
*
FROM Dogs
ORDER BY FirstName;
Result:
dogid | firstname | lastname | age -------+-----------+----------+----- 1 | Bark | Smith | 10 2 | Wag | Frazier | 7
Here I simply replaced DogId
with FirstName
in the ORDER BY
clause.
Not all columns in the DISTINCT ON
clause need to be in the ORDER BY
clause. But this is only true as long as we don’t provide any other columns to the ORDER BY
clause. Doing that would result in the error again:
SELECT
DISTINCT ON(FirstName, Age)
*
FROM Dogs
ORDER BY FirstName, DogId;
Result:
SELECT DISTINCT ON expressions must match initial ORDER BY expressions
In this case my ORDER BY
clause uses FirstName, DogId
but it doesn’t include the Age
column, which is in my DISTINCT ON
clause. If I want to include DogId
in my ORDER BY
clause, then it needs to follow all columns specified in the DISTINCT ON
clause.
Therefore, we can fix this issue by doing the following:
SELECT
DISTINCT ON(FirstName, Age)
*
FROM Dogs
ORDER BY FirstName, Age, DogId;
Result:
dogid | firstname | lastname | age -------+-----------+----------+----- 1 | Bark | Smith | 10 2 | Wag | Frazier | 7
So all I did was insert the Age
column in between FirstName
and DogId
.
In other words, my ORDER BY
clause now starts with the same columns that my DISTINCT ON
clause starts with.
I should point out that the following also works without error:
SELECT
DISTINCT ON(FirstName, Age)
*
FROM Dogs
ORDER BY Age, FirstName, DogId;
Result:
dogid | firstname | lastname | age -------+-----------+----------+----- 2 | Wag | Frazier | 7 1 | Bark | Smith | 10
All I did here is swap the first two columns around in my ORDER BY
clause. So, even though the initial columns must also be in the DISTINCT ON
clause, they don’t necessarily have to be in the same order.
However, we can’t do this:
SELECT
DISTINCT ON(FirstName, Age)
*
FROM Dogs
ORDER BY FirstName, DogId, Age;
Result:
SELECT DISTINCT ON expressions must match initial ORDER BY expressions
That caused the error again. This is because my ORDER BY
clause includes the DogId
column before the Age
column. As alluded to, if we want to specify non-DISTINCT ON
columns in the ORDER BY
clause, they must follow the columns provided in the DISTINCT ON
clause.