Assuming:
you need the set of rows with all columns like your query attempt suggests, not the count of rows like your title suggests.
"last messages", "before" etc. are defined by the id column in your example.
Careful: serial ID numbers are not reliably continuous quantities. Rows inserted "later" may have a lower ID under rare circumstances.
there can be multiple threads with trailing NULL rows for multiple different member_id in the same room_id.
You'll need a multicolumn index like this one in any case:
CREATE INDEX ON dialog_message(room_id, member_id, id DESC);
With a NOT EXISTS semi-join
For small tables or cases with only few stale rows with readtime IS NULL but a later row with readtime IS NOT NULL excluding those, I'd try a partial multicolumn index
You need this index in addition to the one above for best read performance:
CREATE INDEX ON dialog_message(room_id) WHERE readtime IS NULL;
Query:
SELECT d.*
FROM dialog_message d
WHERE d.room_id = 7992
AND d.readtime IS NULL
AND NOT EXISTS (
SELECT FROM dialog_message
WHERE room_id = d.room_id
AND member_id = d.member_id
AND readtime IS NOT NULL
AND id > d.id
);
Double recursive CTE
For big tables or cases with lots of stale rows with readtime IS NULL, I'd expect a recursive CTE to be fastest among pure SQL solutions:
WITH RECURSIVE cte AS ( -- get latest row for each member_id
(
SELECT *
FROM dialog_message
WHERE room_id = 7992
ORDER BY member_id, id DESC
LIMIT 1
)
UNION ALL
(
SELECT d.*
FROM cte
JOIN dialog_message d USING (room_id)
WHERE d.member_id > cte.member_id
ORDER BY d.member_id, id DESC
LIMIT 1
)
)
, cte2 AS (
SELECT *
FROM cte
WHERE readtime IS NULL -- IF not null, look for more ...
UNION ALL
(
SELECT d.*
FROM cte
JOIN dialog_message d USING (room_id, member_id)
WHERE d.id < cte.id
AND cte.readtime IS NULL
ORDER BY d.id DESC
LIMIT 1
)
)
SELECT *
FROM cte2
WHERE readtime IS NULL; -- trim the bounding row with readtime IS NOT NULL
Recursive CTE combined with PL/pgSQL function
Might be one of the rare cases where a procedural solution with a PL/pgSQL function is even faster because it can make do with a single index scan per call.
Create this function once:
CREATE OR REPLACE FUNCTION f_latest_unread_messages(_room_id int, _member_id int)
RETURNS SETOF dialog_message AS
$func$
DECLARE
_rec dialog_message;
BEGIN
FOR _rec IN
SELECT *
FROM dialog_message d
WHERE d.room_id = _room_id
AND d.member_id = _member_id
ORDER BY d.id DESC
LOOP
IF _rec.readtime IS NULL THEN
RETURN NEXT _rec;
ELSE
EXIT;
END IF;
END LOOP;
END
$func$ LANGUAGE plpgsql;
Combine the outer rCTE from above with the function:
WITH RECURSIVE cte AS (
(
SELECT *
FROM dialog_message
WHERE room_id = 7992
ORDER BY member_id, id DESC
LIMIT 1
)
UNION ALL
(
SELECT d.*
FROM cte
JOIN dialog_message d USING (room_id)
WHERE d.member_id > cte.member_id
ORDER BY d.member_id, id DESC
LIMIT 1
)
)
SELECT m.*
FROM cte c, f_latest_unread_messages(c.room_id, c.member_id) m
WHERE c.readtime IS NULL; -- trim rows with readtime NOT NULL immediately
db<>fiddle here
Related: