How to count last records from table where cell null before same cell is not null by order

Question

I have simple table with conversation.

  id   | room_id | member_id |       message        |          readtime          
-------+---------+-----------+----------------------+----------------------------
 81531 |    7992 |         1 | S:TestMessage        | 2018-07-21 17:27:39.844148
 81533 |    7992 |        16 | R:TestOK             | 2018-07-21 17:27:55.914063
 81545 |    7992 |        16 | R:hi                 | 
 81546 |    7992 |        16 | R:hi2                | 
 81547 |    7992 |        16 | R:ho1                | 
 81548 |    7992 |        16 | R:1111               | 2018-07-22 19:14:14.926177
 81550 |    7992 |        16 | R:33                 | 
 81551 |    7992 |        16 | R:33abc              | 
(8 rows)

As we can see, last 2 messages has not been read. For other cases, possible several last messages was not read or all messages has been read.

I have only room_id to filter messages.

My query is

SELECT id, room_id, member_id, message, readtime FROM dialog_message WHERE room_id = 7992

I see three conditions:

Last record must have readtime is null
Count from end while readtime is null
member_id like in last records

So, how can select only last unread messages, represented before id 81548?

I mean id 81550 and 81551.

81548 - has been read and above 81547-81545 read too, even readtime is null

PS: Sorry for my English.

Erwin Brandstetter · Accepted Answer · 2018-07-25T00:47:46.917

Assuming:

you need the set of rows with all columns like your query attempt suggests, not the count of rows like your title suggests.
"last messages", "before" etc. are defined by the id column in your example.
_{Careful: serial ID numbers are not reliably continuous quantities. Rows inserted "later" may have a lower ID under rare circumstances.}
there can be multiple threads with trailing NULL rows for multiple different member_id in the same room_id.

You'll need a multicolumn index like this one in any case:

CREATE INDEX ON dialog_message(room_id, member_id, id DESC);

With a `NOT EXISTS` semi-join

For small tables or cases with only few stale rows with readtime IS NULL but a later row with readtime IS NOT NULL excluding those, I'd try a partial multicolumn index

You need this index in addition to the one above for best read performance:

CREATE INDEX ON dialog_message(room_id) WHERE readtime IS NULL;

Query:

SELECT d.*
FROM   dialog_message d
WHERE  d.room_id = 7992
AND    d.readtime IS NULL
AND    NOT EXISTS (
   SELECT FROM dialog_message
   WHERE  room_id   = d.room_id
   AND    member_id = d.member_id
   AND    readtime IS NOT NULL
   AND    id > d.id
   );

Double recursive CTE

For big tables or cases with lots of stale rows with readtime IS NULL, I'd expect a recursive CTE to be fastest among pure SQL solutions:

WITH RECURSIVE cte AS (  -- get latest row for each member_id
   (
   SELECT *
   FROM   dialog_message
   WHERE  room_id = 7992
   ORDER  BY member_id, id DESC
   LIMIT  1
   )
   UNION ALL
   (
   SELECT d.*
   FROM   cte
   JOIN   dialog_message d USING (room_id)
   WHERE  d.member_id > cte.member_id
   ORDER  BY d.member_id, id DESC
   LIMIT  1
   )
   )
, cte2 AS (
   SELECT *
   FROM   cte
   WHERE  readtime IS NULL  -- IF not null, look for more ...

   UNION ALL
   (
   SELECT d.*
   FROM   cte
   JOIN   dialog_message d USING (room_id, member_id)
   WHERE  d.id < cte.id
   AND    cte.readtime IS NULL
   ORDER  BY d.id DESC
   LIMIT  1
   )
   )
SELECT *
FROM   cte2
WHERE  readtime IS NULL;  -- trim the bounding row with readtime IS NOT NULL

Recursive CTE combined with PL/pgSQL function

Might be one of the rare cases where a procedural solution with a PL/pgSQL function is even faster because it can make do with a single index scan per call.

Create this function once:

CREATE OR REPLACE FUNCTION f_latest_unread_messages(_room_id int, _member_id int)
   RETURNS SETOF dialog_message AS
$func$
DECLARE
   _rec dialog_message;
BEGIN
   FOR _rec IN
      SELECT *
      FROM   dialog_message d
      WHERE  d.room_id   = _room_id
      AND    d.member_id = _member_id
      ORDER  BY d.id DESC
   LOOP
      IF _rec.readtime IS NULL THEN
         RETURN NEXT _rec;
      ELSE
         EXIT;
      END IF;
   END LOOP;
END
$func$  LANGUAGE plpgsql;

Combine the outer rCTE from above with the function:

WITH RECURSIVE cte AS (
   (
   SELECT *
   FROM   dialog_message
   WHERE  room_id = 7992
   ORDER  BY member_id, id DESC
   LIMIT  1
   )
   UNION ALL
   (
   SELECT d.*
   FROM   cte
   JOIN   dialog_message d USING (room_id)
   WHERE  d.member_id > cte.member_id
   ORDER  BY d.member_id, id DESC
   LIMIT  1
   )
   )
SELECT m.*
FROM   cte c, f_latest_unread_messages(c.room_id, c.member_id) m
WHERE  c.readtime IS NULL;  -- trim rows with readtime NOT NULL immediately

db<>fiddle here

score 0 · Answer 2 · answered Jul 22 '18 at 23:20

I think your requirement can be described as: Retrieve every message for a given room where readtime IS NULL and no other, later message for the same room exists, that has a readtime, that IS NOT NULL or is from another user.

Assuming id defines an order on the messages, that leads to a query like the following.

SELECT dm1.id,
       dm1.room_id,
       dm1.member_id,
       dm1.message,
       dm1.readtime
       FROM dialog_message dm1
       WHERE dm1.room_id = 7992
             AND dm1.readtime IS NULL
             AND NOT EXISTS (SELECT *
                                    FROM dialog_message dm2
                                    WHERE dm2.room_id = dm1.room_id
                                          AND dm2.id > dm1.id
                                          AND (dm2.readtime IS NOT NULL
                                                OR dm2.member_id <> dm1.member_id));

db<>fiddle

score 0 · Answer 3 · answered Jul 23 '18 at 13:59

Your input as db fiddle, supposed id is serial

Desired output: 81550, 81551

In the same room_id = 7992 and order by readtime, querying ids which has value greater than the last id which has read-time is not null.

 SELECT id, room_id, member_id, message, readtime
  FROM dialog_message
  WHERE id > ( 
    SELECT id FROM dialog_message WHERE room_id = 7992 AND readtime is not null ORDER BY readtime desc LIMIT 1
    ) AND room_id = 7992;

@Dmitry: Please let's me know if it's not correct your requirement.

How to count last records from table where cell null before same cell is not null by order

3 Answers3

With a NOT EXISTS semi-join

Double recursive CTE

Recursive CTE combined with PL/pgSQL function

With a `NOT EXISTS` semi-join