1

Table setup:

create table users(id bigserial, group_id bigint);
insert into users(group_id) values (1), (1), (2), (1), (3), (3);

Query:

    WITH RECURSIVE r AS (
      SELECT users.id, users.group_id, 0 as i
      FROM users

      UNION 

      SELECT users.id, users.group_id
          , (CASE WHEN users.group_id = lag(users.group_id) over (order by users.id) 
                  THEN (r.i) 
                  ELSE (r.i+1) END) AS i 
       FROM users
       INNER JOIN r ON users.id = r.id
       WHERE r.i < 5)
    SELECT * FROM r;

I'm trying to iterate through user, and assign i if previous group_id value is equal to current, i+1 otherwise, starting from 0.

However, this recursive query doesn't work as I expect, see: http://sqlfiddle.com/#!15/ad3b6/39

Erwin Brandstetter
  • 185,527
  • 28
  • 463
  • 633
lissp
  • 11
  • 2

1 Answers1

2

Simpler and faster (and correct) with window functions and a subquery:

SELECT id, group_id, count(step) OVER (ORDER BY id) AS i
FROM  (
   SELECT id, group_id
        , lag(group_id, 1, group_id) OVER (ORDER BY id) <> group_id OR NULL AS step
   FROM   users
   ) sub;
id|group_id| i
--+--------+--
 1|       1| 0
 2|       1| 0
 3|       2| 1
 4|       1| 2
 5|       3| 3
 6|       3| 3

It's the same basic logic as discussed and explained in depth here (just simpler, without aggregation of rows):

There are also working solutions with a recursive CTE like you tried under the linked question, but those are substantially slower and more complex.

Andthere is also a plpgsql solution. Also more complex, but fastest.
Here is an optimized variant for your simple case without aggregation:

CREATE OR REPLACE FUNCTION f_users_i()
  RETURNS TABLE (id bigint, group_id bigint, i bigint) AS
$func$
DECLARE
   _last_group_id bigint;                 -- aux var to remember last group
BEGIN
   i := 0;                                -- init to 0

   FOR id, group_id IN
      SELECT u.id, u.group_id FROM users u ORDER BY u.id
   LOOP
      IF group_id <> _last_group_id THEN  -- next group starts
         i := i + 1;
      END IF;

      _last_group_id := group_id;         -- remember last group
      RETURN NEXT;
   END LOOP;
END
$func$  LANGUAGE plpgsql;

Call:

SELECT * FROM f_users_i();

Same result.

dbfiddle here (Postgres 9.6).
SQL Fiddle (the same, based on your fiddle on Postgres 9.3).

Aside: I doubt you need bigint for users and groups. Can probably all be just integer.

Erwin Brandstetter
  • 185,527
  • 28
  • 463
  • 633