Suppose I have the following tables:
CREATE TABLE users(
    id serial PRIMARY KEY,
    age integer
);
CREATE TABLE products(
    id serial PRIMARY KEY,
    sku character varying(255),
    user_id integer REFERENCES users
);
So basically a product belongs to a user, and a user can have many products.
I want to be able to load users together with their products, doing the aggregation at the database level. (Maybe wrongly, I assume this is easier and probably more efficient than running aggregation code in my application layer: Postgres comes with nice aggregation functions, so why not use them?)
So I would run something like:
SELECT "users".*, json_agg("products".*) as "products"
FROM "users" LEFT JOIN "products" ON "users"."id" = "products"."user_id"
GROUP BY "users"."id"
All good: my driver can parse JSON structures, so I have nothing to do on the application layer side.
The problem arises when I want to introduce some sort of pagination:
WITH "users" AS (SELECT * FROM "users" ORDER BY "id" LIMIT 20)
SELECT "users".*, json_agg("products".*) as "products"
FROM "users" LEFT JOIN "products" ON "users"."id" = "products"."user_id"
GROUP BY "users"."id"
I get the well-known error:
ERROR: column "users.age" must appear in the GROUP BY clause or be used in an aggregate function
This is because "users"."id" is not considered the primary key of the temporary table created by my subquery.
I could fix it by adding every users column to the GROUP BY clause, but I find that cumbersome, and it seems a shame since I am sure "users"."id" uniquely identifies each row of my subquery.
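For reference, the workaround I mean would look roughly like the sketch below. It assumes users only has the two columns from the schema above; any column added later would also have to be repeated in the GROUP BY, which is exactly what I would like to avoid.

```sql
-- Paginate users first, then aggregate each user's products as JSON.
-- The CTE shadows the real "users" table in the outer query only;
-- inside the CTE, "users" still refers to the base table.
WITH "users" AS (
    SELECT * FROM "users" ORDER BY "id" LIMIT 20
)
SELECT "users".*, json_agg("products".*) AS "products"
FROM "users"
LEFT JOIN "products" ON "users"."id" = "products"."user_id"
-- Every selected column of the CTE is listed explicitly, because the
-- primary-key shortcut does not apply to the CTE's result.
GROUP BY "users"."id", "users"."age"
ORDER BY "users"."id";
```

The final ORDER BY is my own addition so that the page comes back in a stable order; it is not needed to avoid the error.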
So, is there a way to tell the database engine that "users"."id" is some sort of primary key for my subquery?
If not, do you see a better way?
EDIT: This question is quite similar (3 years old though)
Thanks