34

I have two tables employee and phones. An employee can have 0 to n phone numbers. I want to list the employee names with their phone numbers. I am using the below query which runs fine.

SELECT empname,array_agg(phonenumber) AS phonenumbers 
FROM employee LEFT OUTER JOIN phones ON employee.empid = phones.empid
GROUP BY employee.empid

enter image description here

The employee table might contain large number of rows. I want to fetch only some employees at a time. For example I want to fetch 3 employees with their phone numbers. I am trying to run this query.

SELECT empname,array_agg(phonenumber) AS phonenumbers 
FROM 
(SELECT * FROM employee ORDER BY empname LIMIT 3 OFFSET 0) AS employee 
LEFT OUTER JOIN phones ON employee.empid = phones.empid
GROUP BY employee.empid

But I get this error. ERROR: column "employee.empname" must appear in the GROUP BY clause or be used in an aggregate function The only difference between two queries is that I am using sub query in the latter to limit the rows before joining. How do I solve this error?

Programmer
  • 443
  • 2
  • 5
  • 6

1 Answers1

33

The reason your first query works is because of a Postgres feature that allows you to use the primary key of a table with GROUP BY without explicitly specifying other columns of that table in the GROUP BY clause. It is relatively new and works only for base tables.

The optimizer is not (yet?) clever enough to identify primary keys for views, ctes or derived tables (as in your second case).

You can add the columns you want in the SELECT into the GROUP BY clause:

SELECT e.empname, array_agg(p.phonenumber) AS phonenumbers 
FROM 
(SELECT * FROM employee ORDER BY empname LIMIT 3 OFFSET 0) AS e 
LEFT OUTER JOIN phones AS p ON e.empid = p.empid
GROUP BY e.empid, e.empname 
ORDER BY e.empname ;

or use a subquery (and transfer the GROUP BY there):

SELECT e.empname,
       (SELECT array_agg(p.phonenumber) 
        FROM phones AS p
        WHERE e.empid = p.empid
       ) AS phonenumbers 
FROM 
(SELECT * FROM employee ORDER BY empname LIMIT 3 OFFSET 0) AS e 
ORDER BY e.empname ;

which could also be written as:

SELECT e.empname,
       (SELECT array_agg(p.phonenumber) 
        FROM phones AS p
        WHERE e.empid = p.empid
       ) AS phonenumbers 
FROM employee AS e
ORDER BY e.empname LIMIT 3 OFFSET 0 ;

Since you are in version 9.3+. you can also use a LATERAL join:

SELECT e.empname,
       p.phonenumbers 
FROM 
   (SELECT * FROM employee ORDER BY empname LIMIT 3 OFFSET 0) AS e
LEFT JOIN LATERAL
   (SELECT array_agg(phonenumber) AS phonenumbers
    FROM phones 
    WHERE e.empid = phones.empid
   ) AS p ON TRUE 
ORDER BY e.empname ;
ypercubeᵀᴹ
  • 99,450
  • 13
  • 217
  • 306