1

I have this query which is supposed to find the latest uploaded pids and reps of a user. The query looks like this:

SELECT usr.id, up.*, 
      (SELECT COALESCE(state, 'processing') AS pidState FROM PID WHERE userId = usr.id ORDER BY createdAt DESC LIMIT 1),
      (SELECT COALESCE(state, 'processing') AS repState FROM REP WHERE userId = usr.id ORDER BY createdAt DESC LIMIT 1),
      (SELECT createdAt AS pidCreatedAt FROM PID WHERE userId = usr.id ORDER BY createdAt DESC LIMIT 1),
      (SELECT createdAt AS repCreatedAt FROM REP WHERE userId = usr.id ORDER BY createdAt DESC LIMIT 1)
FROM User AS usr
JOIN UserProfile AS up ON up.userId = usr.id
ORDER BY pidCreatedAt DESC NULLS LAST, repCreatedAt DESC NULLS LAST

I tried joining PID and REP to the User but that gave me a load of returned rows since I only want the latest from each. Now I also have the need to return the PID and REP id fields, which would result in another subquery, so I am wondering if there's a smarter way to solve this other than to keep on writing sub queries for every column that I need in the future?

Erwin Brandstetter
  • 185,527
  • 28
  • 463
  • 633

1 Answers1

2

Based on a couple of assumptions, this might be the query you are looking for:

SELECT up.*
     , p.pidState, p.pidCreatedAt  -- more?
     , r.repState, r.repCreatedAt  -- more?
FROM   UserProfile AS up
LEFT   JOIN (
   SELECT DISTINCT ON (userId)
          userId
        , COALESCE(state, 'processing') AS pidState
        , createdAt AS pidCreatedAt
        -- , more columns from PID
   FROM   PID
   ORDER  BY userId, createdAt DESC NULLS LAST
   ) p USING (userId)
LEFT   JOIN (
   SELECT DISTINCT ON (userId)
          userId
        , COALESCE(state, 'processing') AS repState
        , createdAt AS repCreatedAt
        -- , more columns from REP
   FROM  REP
   ORDER BY userId, createdAt DESC NULLS LAST
   ) r USING (userId)
ORDER  BY pidCreatedAt DESC NULLS LAST, repCreatedAt DESC NULLS LAST

If referential integrity is guaranteed by a FK constraint, you can just omit the table User (invalid name) from the query.

While fetching all or most rows it typically makes sense to run subqueries with DISTINCT ON (or a related technique) on the whole tables PID and REP.

For a small selection, LATERAL subqueries will be more efficient.

See these related answers for detailed explanations:

Aside: use legal, lowercase, unquoted identifiers in Postgres if possible. Makes your life easier.

Erwin Brandstetter
  • 185,527
  • 28
  • 463
  • 633