Fix “SELECT DISTINCT ON expressions must match initial ORDER BY expressions” in PostgreSQL

If you’re getting a PostgreSQL error that reads “SELECT DISTINCT ON expressions must match initial ORDER BY expressions” when trying to run a query, it’s probably because the initial columns provided to your ORDER BY clause are different to the ones provided to the DISTINCT ON clause.

To fix this error, make sure the initial columns provided to the ORDER BY clause are included in the DISTINCT ON clause.

Example of Error

Here’s an example of code that produces the error:

SELECT 
    DISTINCT ON(FirstName, Age) 
    * 
FROM Dogs
ORDER BY DogId;

Result:

SELECT DISTINCT ON expressions must match initial ORDER BY expressions

I got the error because I passed the DogId column to the ORDER BY clause, but that column isn’t included in the DISTINCT ON clause.

I get the same error if I do the following:

SELECT 
    DISTINCT ON(FirstName, Age) 
    * 
FROM Dogs
ORDER BY DogId, FirstName, Age;

Result:

SELECT DISTINCT ON expressions must match initial ORDER BY expressions

So I still got the error, even though my ORDER BY clause includes the columns from the DISTINCT ON clause. The problem here is that I started my ORDER BY clause with a column that’s not in the DISTINCT ON clause.

Solution

To fix this issue, we need to ensure the initial columns provided to the ORDER BY clause are also included in the DISTINCT ON clause. If we include any other columns in our ORDER BY clause, they must come after the columns that are in the DISTINCT ON clause.

So here’s a quick fix for the above issue:

SELECT 
    DISTINCT ON(FirstName, Age) 
    * 
FROM Dogs
ORDER BY FirstName;

Result:

 dogid | firstname | lastname | age 
-------+-----------+----------+-----
     1 | Bark      | Smith    |  10
     2 | Wag       | Frazier  |   7

Here I simply replaced DogId with FirstName in the ORDER BY clause.

Not all columns in the DISTINCT ON clause need to be in the ORDER BY clause. But this is only true as long as we don’t provide any other columns to the ORDER BY clause. Doing that would result in the error again:

SELECT 
    DISTINCT ON(FirstName, Age) 
    * 
FROM Dogs
ORDER BY FirstName, DogId;

Result:

SELECT DISTINCT ON expressions must match initial ORDER BY expressions

In this case my ORDER BY clause uses FirstName, DogId but it doesn’t include the Age column, which is in my DISTINCT ON clause. If I want to include DogId in my ORDER BY clause, then it needs to follow all columns specified in the DISTINCT ON clause.

Therefore, we can fix this issue by doing the following:

SELECT 
    DISTINCT ON(FirstName, Age) 
    * 
FROM Dogs
ORDER BY FirstName, Age, DogId;

Result:

 dogid | firstname | lastname | age 
-------+-----------+----------+-----
     1 | Bark      | Smith    |  10
     2 | Wag       | Frazier  |   7

So all I did was insert the Age column in between FirstName and DogId.

In other words, my ORDER BY clause now starts with the same columns that my DISTINCT ON clause starts with.

I should point out that the following also works without error:

SELECT 
    DISTINCT ON(FirstName, Age) 
    * 
FROM Dogs
ORDER BY Age, FirstName, DogId;

Result:

 dogid | firstname | lastname | age 
-------+-----------+----------+-----
     2 | Wag       | Frazier  |   7
     1 | Bark      | Smith    |  10

All I did here is swap the first two columns around in my ORDER BY clause. So, even though the initial columns must also be in the DISTINCT ON clause, they don’t necessarily have to be in the same order.

However, we can’t do this:

SELECT 
    DISTINCT ON(FirstName, Age) 
    * 
FROM Dogs
ORDER BY FirstName, DogId, Age;

Result:

SELECT DISTINCT ON expressions must match initial ORDER BY expressions

That caused the error again. This is because my ORDER BY clause includes the DogId column before the Age column. As alluded to, if we want to specify non-DISTINCT ON columns in the ORDER BY clause, they must follow the columns provided in the DISTINCT ON clause.