I have a database where rows in table A could have N >= 0 associated rows in table B. I am trying to construct a query that returns all of the columns of A with an array of all of the associated rows in table B. It looks like this should be possible according to the documentation.
It looks like I want case 4 aggregate_name ( * ) [ FILTER ( WHERE filter_clause ) ]
but when I try to use array_agg(*)
it results in an error:
No function matches the given name and argument types. You might need to add explicit type casts.
What am I missing here? I have seen the string concatenation approach but if I could just select everything without having to specify columns that would be easier I think.
SQL:
CREATE TABLE containers (
id bigint PRIMARY KEY
);
CREATE TABLE elements (
id bigint PRIMARY KEY,
container_id bigint REFERENCES containers(id)
);
insert into containers VALUES (1), (2), (3);
insert into elements VALUES (1, 1), (2, 1), (3, 2);
SELECT *
FROM containers
JOIN (
SELECT elements.id, array_agg(*)
FROM elements
GROUP BY elements.id
) AS elements ON elements.container_id=containers.id
db_fiddle:
I have a database where rows in table A could have N >= 0 associated rows in table B. I am trying to construct a query that returns all of the columns of A with an array of all of the associated rows in table B. It looks like this should be possible according to the documentation.
It looks like I want case 4 aggregate_name ( * ) [ FILTER ( WHERE filter_clause ) ]
but when I try to use array_agg(*)
it results in an error:
No function matches the given name and argument types. You might need to add explicit type casts.
What am I missing here? I have seen the string concatenation approach but if I could just select everything without having to specify columns that would be easier I think.
SQL:
CREATE TABLE containers (
id bigint PRIMARY KEY
);
CREATE TABLE elements (
id bigint PRIMARY KEY,
container_id bigint REFERENCES containers(id)
);
insert into containers VALUES (1), (2), (3);
insert into elements VALUES (1, 1), (2, 1), (3, 2);
SELECT *
FROM containers
JOIN (
SELECT elements.id, array_agg(*)
FROM elements
GROUP BY elements.id
) AS elements ON elements.container_id=containers.id
db_fiddle: https://dbfiddle.uk/C8XgWVLW
Share Improve this question edited Mar 3 at 20:11 Cogito Ergo Sum asked Mar 3 at 19:39 Cogito Ergo SumCogito Ergo Sum 8722 gold badges11 silver badges24 bronze badges 3 |3 Answers
Reset to default 2You don't share your desired output for this sample data, so I'm sharing two options.
One, using array_agg
which will give a less descriptive output as the column names are not preserved, merely the column ordinals:
SELECT *
FROM containers
JOIN (
SELECT container_id, array_agg(elements)
FROM elements
GROUP BY container_id
) AS elements ON elements.container_id=containers.id
id | container_id | array_agg |
---|---|---|
1 | 1 | {"(1,1)","(2,1)"} |
2 | 2 | {"(3,2)"} |
https://dbfiddle.uk/8XJ4FlaE
And another using json_agg()
which serializes the data with the column names:
SELECT *
FROM containers
JOIN (
SELECT elements.container_id, json_agg(elements)
FROM elements
GROUP BY elements.container_id
) AS elements ON elements.container_id=containers.id
id | container_id | json_agg |
---|---|---|
1 | 1 | [{"id":1,"container_id":1}, {"id":2,"container_id":1}] |
2 | 2 | [{"id":3,"container_id":2}] |
https://dbfiddle.uk/h0HuyzB5
You'll notice two key changes from your attempt:
- We are selecting and grouping on
container_id
in your subquery since that is your join criteria and desired level of granularity. - We supply the table name (or alias) to the aggregation function so that all columns are included in the array aggregation.
Lastly, both of these can be rewritten without the subquery:
SELECT c.id, json_agg(e)
FROM containers c
INNER JOIN elements e
ON c.id = e.container_id
GROUP BY c.id
You'll aggregate complete records, not a list of fields of the record: that is, pass the table name instead of *
.
Additionally, you seem to want to group by container_id.
All in all, the query you'll like may be:
SELECT *
FROM containers
JOIN (
SELECT container_id, array_agg(elements) elements
FROM elements
GROUP BY container_id
) AS elements ON elements.container_id=containers.id;
(see it in your fiddle, adapted)
SELECT id, array_agg((e.*))
FROM containers
JOIN elements AS e(eid,id)
USING(id)
GROUP BY id;
id | array_agg |
---|---|
1 | {"(1,1)","(2,1)"} |
2 | {"(3,2)"} |
demo at db<>fiddle
Your array_agg(*)
concept isn't incorrect but that's not the right, or complete syntax for it.
An asterisk *
as an aggregate function argument doesn't mean the same thing as it does in a select
list: I take that you expected it to be interpreted as all columns, the way it works when used directly in the select list. Meanwhile, as an aggregate function argument, it means the opposite: no columns.
arg_data_type
An input data type on which this aggregate function operates. To create a zero-argument aggregate function, write*
in place of the list of argument specifications. (An example of such an aggregate iscount(*)
.)
It's also mentioned in the doc that you linked:
The fourth form invokes the aggregate once for each input row; since no particular input value is specified, it is generally only useful for the
count(*)
aggregate function.
There's no zero-argument version of the array_agg()
function defined in a standard PostgreSQL build, so an array_agg(*)
won't work. Note how the error message you got says the function it couldn't find is a function with no arguments (empty parentheses instead of an asterisk or a list of types corresponding to all fields):
ERROR: function array_agg() does not exist LINE 4: SELECT elements.id, array_agg(*) ^ HINT: No function matches the given name and argument types. You might need to add explicit type casts.
There is only a single-element version, and you can by all means pass whole rows into it, as long as you parenthesise and qualify the *
expression, like in the example up top.
Aliasing the elements
fields to match join column names enables JOIN..USING
and stops it from listing them twice.
There's a slight advantage of array_agg((e.*))
over already mentioned array_agg(e)
: the former lets you mix things in. If you don't add anything, it works exactly the same:
SELECT id
, array_agg(e)
, array_agg((e))
, array_agg((e.*))
FROM containers
JOIN elements AS e(eid,id)
USING(id)
GROUP BY id;
id | array_agg | array_agg | array_agg |
---|---|---|---|
1 | {"(1,1)","(2,1)"} | {"(1,1)","(2,1)"} | {"(1,1)","(2,1)"} |
2 | {"(3,2)"} | {"(3,2)"} | {"(3,2)"} |
If you do add something, the tablename-as-select-expression syntax will nest the record:
SELECT id
, array_agg((e.* , 'x' , id*e.eid))
, array_agg((e , 'x' , id*e.eid))
FROM containers
JOIN elements AS e(eid,id)
USING(id)
GROUP BY id;
id | array_agg | array_agg |
---|---|---|
1 | {"(1,1,x,1)","(2,1,x,2)"} | {"("(1,1)",x,1)","("(2,1)",x,2)"} |
2 | {"(3,2,x,6)"} | {"("(3,2)",x,6)"} |
Keep in mind that if you do mix something in, the resulting type will no longer be a regular array type elements[]
but a pseudo-type record[]
and those cannot be used in column definitions (e.g. create table..as..
and create view..as..
will refuse to accept a query with a pseudo-type column).
[1, 1]
[2, 1]
etc? What if the columns ofelements
were different types, such as strings or dates? – Charlieface Commented Mar 3 at 21:04