
I'm on the cusp of understanding this, but I don't understand the fix that I need. The following query takes about a minute, give or take, to return 20 records:

explain (analyze, buffers)
select "search_docket"."id"
FROM "search_docket"
WHERE "search_docket"."court_id" = 'delctcompl'
ORDER BY "search_docket"."id" DESC
LIMIT 20; 
                                                                           QUERY PLAN                                                                            
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.56..2791.05 rows=20 width=4) (actual time=74.950..41059.055 rows=20 loops=1)
   Buffers: shared hit=38524709 dirtied=10
   ->  Index Scan Backward using search_docket_pkey on search_docket  (cost=0.56..8837737.95 rows=63342 width=4) (actual time=74.947..41059.022 rows=20 loops=1)
         Filter: ((court_id)::text = 'delctcompl'::text)
         Rows Removed by Filter: 41862720
         Buffers: shared hit=38524709 dirtied=10
 Planning time: 0.450 ms
 Execution time: 41059.233 ms

Flipping the sort order makes it take about 60ms:

explain (analyze, buffers)
select "search_docket"."id"
FROM "search_docket"
WHERE "search_docket"."court_id" = 'delctcompl'
ORDER BY "search_docket"."id" ASC
LIMIT 20; 
                                                                     QUERY PLAN                                                                      
-----------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.56..2791.05 rows=20 width=4) (actual time=63.701..63.939 rows=20 loops=1)
   Buffers: shared hit=45783
   ->  Index Scan using search_docket_pkey on search_docket  (cost=0.56..8837737.95 rows=63342 width=4) (actual time=63.698..63.933 rows=20 loops=1)
         Filter: ((court_id)::text = 'delctcompl'::text)
         Rows Removed by Filter: 67080
         Buffers: shared hit=45783
 Planning time: 0.426 ms
 Execution time: 63.971 ms

You can see that in the first one, it gets nasty right here:

         Rows Removed by Filter: 41862720
         Buffers: shared hit=38524709 dirtied=10

Here are the indexes on the table (omitting ones that are not related to this query):

Indexes:
    "search_docket_pkey" PRIMARY KEY, btree (id)
    "search_docket_7a46e69c" btree (court_id)
    "search_docket_court_id_2d2438b2594e74ba_like" btree (court_id varchar_pattern_ops)

Stray thoughts:

  1. The court_id column has very low cardinality. It has about 500 distinct values across about 50M rows.

  2. I guess I could add a descending index on court_id, and maybe that'd fix it, but that doesn't seem right.

  3. Maybe I need a multi-column index here on search_docket.id and search_docket.court_id? Seems off.

Is there something better I should be doing here that would make the index work better?

mlissner

2 Answers


Ideal for those queries would be an index on (court_id, id), with the columns in that order. It should be extremely fast in either direction. And once you have it, you should be able to get rid of the plain index on court_id, as it wouldn't be much good anymore.
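A sketch of that change (the new index name here is just illustrative; CONCURRENTLY is worth using so a 50M-row table isn't locked against writes while the index builds):

CREATE INDEX CONCURRENTLY search_docket_court_id_id
    ON search_docket (court_id, id);
-- Once it's in place, the plain court_id index is redundant:
DROP INDEX search_docket_7a46e69c;

With that index, the planner can descend straight to the 'delctcompl' entries and read their ids already in sorted order, forward or backward, so the LIMIT stops after reading 20 index entries instead of filtering millions of rows.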

jjanes

The rows that satisfy the condition (estimated to be around 63000) all have a low id. PostgreSQL does not know that, so it considers the index scan equally appealing in either direction.

If you know the data will always be like that and you want to keep the planner away from the backward scan on that index, change the ORDER BY clause so that it does not match the indexed expression, for example by adding + 0.
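A sketch of that rewrite, assuming id is an integer (the + 0 is arithmetically a no-op, but the planner no longer matches the sort to the primary key index, so it fetches the matches via the court_id index and sorts them instead):

SELECT "search_docket"."id"
FROM "search_docket"
WHERE "search_docket"."court_id" = 'delctcompl'
ORDER BY "search_docket"."id" + 0 DESC
LIMIT 20;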

Laurenz Albe