Understanding the ARRAY Constructor in PostgreSQL

In PostgreSQL we can create arrays with the ARRAY constructor. An array constructor is an expression that builds an array value using values for its member elements.

Example

Here’s a basic example of an array constructor:

SELECT ARRAY[1,2,3];

Result:

  array  
---------
{1,2,3}

With the ARRAY constructor, we pass a comma separated list of values enclosed in square brackets. These values become the member elements for the resulting array.

Here’s another example:

SELECT ARRAY[ 'Cat', 'Dog', 'Horse' ];

Result:

      array      
-----------------
{Cat,Dog,Horse}

Expressions

We can use expressions such as the following in the list of array elements:

SELECT ARRAY[1,2,3+7];

Result:

  array   
----------
{1,2,10}

The expression is evaluated/calculated and the result is returned in the resulting array.

Explicit Casting

By default, the array element type is the common type of the member expressions, determined using the same rules as for UNION or CASE constructs. However, we also have the option of explicitly casting the array to the desired type.

SELECT ARRAY[1,2,3]::integer[];

Result:

  array  
---------
{1,2,3}

Here we can force the array into a type that it wouldn’t otherwise have arrived at. For example:

SELECT ARRAY[1,2,3.75]::integer[];

Result:

  array  
---------
{1,2,4}

In this case one of the arguments to the array constructor was 3.75, but that was rounded up into an integer.

Here it is again but being cast to the numeric type:

SELECT ARRAY[1,2,3.75]::numeric[];

Result:

   array    
------------
{1,2,3.75}

This time it remains as 3.75.

Multi-Dimensional Arrays

We can use a single ARRAY constructor to create a multi-dimensional array by simply embedding each child array in its own set of square brackets:

SELECT ARRAY[ [1,2,3], [4,5,6], [7,8,9] ];

Result:

           array           
---------------------------
{{1,2,3},{4,5,6},{7,8,9}}

We can get the same result by explicitly using ARRAY constructors for the child arrays:

SELECT ARRAY[ ARRAY[1,2,3], ARRAY[4,5,6], ARRAY[7,8,9] ];

Result:

           array           
---------------------------
{{1,2,3},{4,5,6},{7,8,9}}

One thing to be mindful of when creating multi-dimensional arrays is that inner constructors at the same level must produce sub-arrays of identical dimensions.

Therefore, the following won’t work:

SELECT ARRAY[ [1,2,3], [4,5,6,7] ];

Result:

ERROR:  multidimensional arrays must have array expressions with matching dimensions

This is because multidimensional arrays must be rectangular. In this example the first inner array has three elements while the second has four. They must both contain the same number of elements.

Empty Arrays

We can create empty arrays, but we need to explicitly specify the array type when we do this. This is because arrays must have a type – it’s impossible to have an array with no type.

Example:

SELECT ARRAY[]::integer[];

Result:

 array 
-------
{}

We can see that this successfully produced an empty array.

Here’s what happens when we don’t explicitly specify the array type:

SELECT ARRAY[];

Result:

ERROR:  cannot determine type of empty array
LINE 1: SELECT ARRAY[];
^
HINT: Explicitly cast to the desired type, for example ARRAY[]::integer[].

As expected, an error occurs.

Construct an Array from a Subquery

We can construct arrays from the results of a subquery.

Example:

SELECT ARRAY(
    SELECT CatName FROM Cats
);

Result:

         array         
-----------------------
{Fluffy,Scratch,Purr}

Notice that we used parentheses instead of square brackets in this case. This is required. Here’s what happens if we try to use square brackets:

SELECT ARRAY[
    SELECT CatName FROM Cats
];

Result:

ERROR:  syntax error at or near "SELECT"
LINE 2: SELECT CatName FROM Cats
^

Also, the subquery must return only one column. If it returns more than one column we get an error:

SELECT ARRAY(
    SELECT CatId, CatName FROM Cats
);

Result:

ERROR:  subquery must return only one column
LINE 1: SELECT ARRAY(
^

Construct a Multi-Dimensional Array from Subqueries

We can also create multi-dimensional arrays from subqueries:

Suppose we have the following table:

SELECT * FROM Pets;

Result:

 petid | petname | pettype 
-------+---------+---------
1 | Fluffy | Cat
2 | Scratch | Cat
3 | Purr | Cat
4 | Bark | Dog
5 | Wag | Dog
6 | Fetch | Dog

We can use this data to create a multi-dimensional array:

SELECT ARRAY[
ARRAY(SELECT PetName FROM Pets WHERE PetType = 'Dog'),
ARRAY(SELECT PetName FROM Pets WHERE PetType = 'Cat')
];

Result:

                  array                   
------------------------------------------
{{Bark,Wag,Fetch},{Fluffy,Scratch,Purr}}

In this case we separated the inner arrays by pet type; one array contains cats and the other dogs.

Subscript Numbering

The subscripts of an array value built with ARRAY always begin with 1.

We can return the subscripts with the generate_subscripts() function:

SELECT generate_subscripts(ARRAY[ 'Cat', 'Dog', 'Horse' ], 1);

Result:

 generate_subscripts 
---------------------
1
2
3

Implicit Determination of the Array Type

As mentioned, if we don’t explicitly cast the array to a certain type, PostgreSQL works out the type for us. In this case the array element type is the common type of the member expressions, determined using the same rules as for UNION or CASE constructs.

Here are those rules:

  1. If all inputs are of the same type, and it is not unknown, resolve as that type.
  2. If any input is of a domain type, treat it as being of the domain’s base type for all subsequent steps.
  3. If all inputs are of type unknown, resolve as type text. Otherwise, unknown inputs are ignored for the purposes of the remaining rules.
  4. If the non-unknown inputs are not all of the same type category, fail.
  5. Select the first non-unknown input type as the candidate type, then consider each other non-unknown input type, left to right. If the candidate type can be implicitly converted to the other type, but not vice-versa, select the other type as the new candidate type. Then continue considering the remaining inputs. If, at any stage of this process, a preferred type is selected, stop considering additional inputs.
  6. Convert all inputs to the final candidate type. Fail if there is not an implicit conversion from a given input type to the candidate type.

Source: PostgreSQL documentation