4

In an application where a user can have copies of a mobile application installed on multiple devices I'd like to know how many iphone and how many android apps each user has.

I tried a query such as

gv2=> SELECT userid, 
       COUNT(ALL version_identifier LIKE '%ios%'), 
       COUNT(ALL version_identifier LIKE '%android%') 
FROM gl.user_device 
GROUP BY userid;
                userid                | count | count 
--------------------------------------+-------+-------
 46d0f5b7-42b0-4aad-9162-1390c32cb06e |     7 |     7
 5d519794-abfe-4863-82d4-6da33db7637b |     7 |     7
 a81cff6b-30f2-4b6e-a5bf-b1a933904473 |     1 |     1
 b65f0708-0cd1-11e7-878b-06fa189da46b |     4 |     4
 94b91b02-ff43-4a9a-b317-037fa2a347d3 |     1 |     1
 a4cacd98-1216-4801-b058-b28b8fa632a9 |     8 |     8
(6 rows)

Which is clearly wrong.

For example

gv2=> SELECT userid FROM gl.user_device WHERE version_identifier LIKE '%ios%';
                userid                
--------------------------------------
 5d519794-abfe-4863-82d4-6da33db7637b
(1 row)

In case it matters I need to support Postgres 10.1 but preferably also 9.5.10 as I have one old Production DB still on that version.

Johan
  • 623
  • 4
  • 8
  • 18

2 Answers2

9

Use the modern aggregate FILTER syntax in Postgres 9.4 or later:

SELECT userid,
       COUNT(*) FILTER (WHERE version_identifier LIKE '%ios%') AS nr_ios,
       COUNT(*) FILTER (WHERE version_identifier LIKE '%android%') AS nr_android
FROM   gl.user_device
GROUP  BY userid;

It's shorter, clearer and a bit faster. See:

But it still results in a sequential scan on the whole table.

If ...

  • the table is big and performance is important,
  • a substantial percentage of rows does not contribute to either count,
  • and you have a matching index - a trigram index on version_identifier in your case, see:

... then run two separate queries instead to tap the full potential of the index. Example with two subqueries in a FULL OUTER JOIN to make it a 100% equivalent drop-in replacement:

SELECT userid
     , COALESCE(nr_ios    , 0) AS nr_ios
     , COALESCE(nr_android, 0) AS nr_android
FROM  (
   SELECT userid, COUNT(*) AS nr_ios
   FROM   gl.user_device
   WHERE  version_identifier LIKE '%ios%'
   GROUP  BY 1
   ) i
FULL JOIN (
   SELECT userid, COUNT(*) AS nr_android
   FROM   gl.user_device
   WHERE  version_identifier LIKE '%android%'
   GROUP  BY 1
   ) a USING (userid);

Related:

Erwin Brandstetter
  • 185,527
  • 28
  • 463
  • 633
1

I found a solution here: https://stackoverflow.com/questions/21288458/in-redshift-postgres-how-to-count-rows-that-meet-a-condition

Basically:

SELECT userid,
       COUNT(CASE WHEN version_identifier like '%ios%' THEN 1 END) as nr_ios,
       COUNT(CASE WHEN version_identifier like '%android%' THEN 1 END) as nr_android
FROM gl.user_device
GROUP BY userid;

This appears to work on both PQ 9.5 and 10

Johan
  • 623
  • 4
  • 8
  • 18