
I'm running two PostgreSQL 15 RDS instances (PostgreSQL 15.7 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 7.3.1 20180712 (Red Hat 7.3.1-12), 64-bit) in AWS, one for my staging environment and one for my production environment. We're running a query that takes a lot longer in the production environment than in staging, even though production has less data than staging (at least in the selected/joined tables). Production is also not under heavy load: we're in an early testing stage, so there's basically one person using it at night for testing purposes.

This is the query:

SELECT arenas.id,
       arenas.display_name,
       arenas.cover_image_path,
       arenas.slug,
       addresses.zip_code,
       addresses.street,
       addresses.number,
       addresses.complement,
       addresses.district,
       addresses.latitude,
       addresses.longitude,
       cities.NAME                                       AS city_name,
       states.NAME                                       AS state_name,
       states.uf                                         AS state_uf,
       Array_to_json(Array_agg(DISTINCT ss2.sport_code)) AS available_sports,
       Earth_distance(Ll_to_earth (addresses.latitude, addresses.longitude),
       Ll_to_earth (-10.5555, -41.2751))                 AS
       meters_distance_between_user_and_arena
FROM   "arenas"
       INNER JOIN "addresses"
               ON "arenas"."id" = "addresses"."addressable_id"
                  AND "addresses"."addressable_type" = 'App\Models\Arena'
                  AND "addresses"."type" = 'COMMERCIAL'
                  AND Earth_distance(Ll_to_earth (addresses.latitude,
                                     addresses.longitude),
                          Ll_to_earth (-10.5555, -41.2751)) < 20000
       INNER JOIN "services"
               ON "services"."arena_id" = "arenas"."id"
                  AND "services"."status" = 'A'
                  AND "services"."deleted_at" IS NULL
                  AND "is_private" = false
       INNER JOIN "service_sport"
               ON "service_sport"."service_id" = "services"."id"
                  AND "service_sport"."sport_code" = 'BEACH_TENNIS'
       INNER JOIN "service_prices"
               ON "service_prices"."service_id" = "services"."id"
                  AND "service_prices"."is_default" = true
       INNER JOIN "field_service"
               ON "field_service"."service_id" = "services"."id"
       INNER JOIN "fields"
               ON "fields"."arena_id" = "arenas"."id"
                  AND "fields"."status" = 'A'
                  AND "fields"."deleted_at" IS NULL
       INNER JOIN "contacts"
               ON "contacts"."contactable_id" = "arenas"."id"
                  AND "contacts"."contactable_type" = 'App\Models\Arena'
                  AND "contacts"."is_main" = true
       INNER JOIN "field_time_slots"
               ON "field_time_slots"."arena_id" = "arenas"."id"
       INNER JOIN "cities"
               ON "cities"."ibge_code" = "addresses"."city_ibge_code"
       INNER JOIN "states"
               ON "states"."ibge_code" = "cities"."state_ibge_code"
       INNER JOIN "sports"
               ON "sports"."code" = "service_sport"."sport_code"
       INNER JOIN "service_sport" AS "ss2"
               ON "ss2"."arena_id" = "arenas"."id"
WHERE  "approved_at" IS NOT NULL
       AND EXISTS (SELECT *
                   FROM   "subscriptions"
                   WHERE  "arenas"."id" = "subscriptions"."arena_id"
                          AND "type" = 'access'
                          AND ( "ends_at" IS NULL
                                 OR ( "ends_at" IS NOT NULL
                                      AND "ends_at" > '2024-10-06 01:31:18' ) )
                          AND "stripe_status" != 'incomplete_expired'
                          AND "stripe_status" != 'unpaid'
                          AND "stripe_status" != 'past_due'
                          AND "stripe_status" != 'incomplete')
       AND "business_hours_data" IS NOT NULL
       AND "arenas"."deleted_at" IS NULL
GROUP  BY "arenas"."id",
          "arenas"."cover_image_path",
          "addresses"."latitude",
          "addresses"."longitude",
          "addresses"."zip_code",
          "addresses"."street",
          "addresses"."number",
          "addresses"."complement",
          "addresses"."district",
          "cities"."name",
          "states"."name",
          "states"."uf"
ORDER  BY "meters_distance_between_user_and_arena" ASC;

This is the EXPLAIN ANALYZE output from the production environment:

Sort  (cost=55657.12..55795.57 rows=55380 width=315) (actual time=563.084..563.104 rows=1 loops=1)
  Sort Key: (sec_to_gc(cube_distance((ll_to_earth((addresses.latitude)::double precision, (addresses.longitude)::double precision))::cube, '(3491544.0649759113, -4339378.172513269, -3108045.069568795)'::cube)))
  Sort Method: quicksort  Memory: 25kB
  ->  GroupAggregate  (cost=12417.08..43152.98 rows=55380 width=315) (actual time=563.077..563.097 rows=1 loops=1)
        Group Key: arenas.id, addresses.latitude, addresses.longitude, addresses.zip_code, addresses.street, addresses.number, addresses.complement, addresses.district, cities.name, states.name, states.uf
        ->  Sort  (cost=12417.08..12555.53 rows=55380 width=286) (actual time=222.049..445.141 rows=102240 loops=1)
              Sort Key: arenas.id, addresses.latitude, addresses.longitude, addresses.zip_code, addresses.street, addresses.number, addresses.complement, addresses.district, cities.name, states.name, states.uf
              Sort Method: external merge  Disk: 28144kB
              ->  Hash Join  (cost=17.39..668.95 rows=55380 width=286) (actual time=0.709..15.847 rows=102240 loops=1)
                    Hash Cond: (arenas.id = field_time_slots.arena_id)
                    ->  Hash Join  (cost=9.60..37.48 rows=260 width=382) (actual time=0.604..1.425 rows=480 loops=1)
                          Hash Cond: (arenas.id = ss2.arena_id)
                          ->  Nested Loop  (cost=7.39..32.21 rows=52 width=339) (actual time=0.523..1.121 rows=96 loops=1)
                                ->  Seq Scan on sports  (cost=0.00..1.16 rows=1 width=9) (actual time=0.006..0.009 rows=1 loops=1)
                                      Filter: (code = 'BEACH_TENNIS'::text)
                                      Rows Removed by Filter: 12
                                ->  Hash Join  (cost=7.39..30.53 rows=52 width=350) (actual time=0.515..1.070 rows=96 loops=1)
                                      Hash Cond: (cities.state_ibge_code = states.ibge_code)
                                      ->  Nested Loop  (cost=5.78..28.77 rows=52 width=340) (actual time=0.488..0.953 rows=96 loops=1)
                                            ->  Nested Loop  (cost=5.49..11.03 rows=52 width=332) (actual time=0.466..0.640 rows=96 loops=1)
                                                  Join Filter: (arenas.id = services.arena_id)
                                                  ->  Nested Loop  (cost=2.05..4.72 rows=4 width=305) (actual time=0.422..0.444 rows=4 loops=1)
                                                        Join Filter: (arenas.id = fields.arena_id)
                                                        ->  Nested Loop  (cost=2.05..3.62 rows=1 width=289) (actual time=0.412..0.417 rows=1 loops=1)
                                                              Join Filter: (arenas.id = addresses.addressable_id)
                                                              ->  Merge Join  (cost=2.05..2.08 rows=1 width=174) (actual time=0.023..0.027 rows=1 loops=1)
                                                                    Merge Cond: (arenas.id = contacts.contactable_id)
                                                                    ->  Sort  (cost=1.02..1.02 rows=1 width=158) (actual time=0.012..0.013 rows=1 loops=1)
                                                                          Sort Key: arenas.id
                                                                          Sort Method: quicksort  Memory: 25kB
                                                                          ->  Seq Scan on arenas  (cost=0.00..1.01 rows=1 width=158) (actual time=0.006..0.006 rows=1 loops=1)
                                                                                Filter: ((approved_at IS NOT NULL) AND (business_hours_data IS NOT NULL) AND (deleted_at IS NULL))
                                                                    ->  Sort  (cost=1.03..1.04 rows=1 width=16) (actual time=0.008..0.009 rows=1 loops=1)
                                                                          Sort Key: contacts.contactable_id
                                                                          Sort Method: quicksort  Memory: 25kB
                                                                          ->  Seq Scan on contacts  (cost=0.00..1.02 rows=1 width=16) (actual time=0.006..0.006 rows=1 loops=1)
                                                                                Filter: (is_main AND ((contactable_type)::text = 'App\Models\Arena'::text))
                                                                                Rows Removed by Filter: 1
                                                              ->  Seq Scan on addresses  (cost=0.00..1.52 rows=1 width=115) (actual time=0.386..0.387 rows=1 loops=1)
                                                                    Filter: (((addressable_type)::text = 'App\Models\Arena'::text) AND ((type)::text = 'COMMERCIAL'::text) AND (sec_to_gc(cube_distance((ll_to_earth((latitude)::double precision, (longitude)::double precision))::cube, '(3491544.0649759113, -4339378.172513269, -3108045.069568795)'::cube)) < '20000'::double precision))
                                                        ->  Seq Scan on fields  (cost=0.00..1.05 rows=4 width=16) (actual time=0.009..0.020 rows=4 loops=1)
                                                              Filter: ((deleted_at IS NULL) AND ((status)::text = 'A'::text))
                                                  ->  Materialize  (cost=3.43..5.57 rows=13 width=27) (actual time=0.011..0.035 rows=24 loops=4)
                                                        ->  Hash Join  (cost=3.43..5.50 rows=13 width=27) (actual time=0.041..0.092 rows=24 loops=1)
                                                              Hash Cond: (service_prices.service_id = service_sport.service_id)
                                                              ->  Seq Scan on service_prices  (cost=0.00..1.75 rows=36 width=8) (actual time=0.006..0.032 rows=36 loops=1)
                                                                    Filter: is_default
                                                                    Rows Removed by Filter: 39
                                                              ->  Hash  (cost=3.41..3.41 rows=2 width=51) (actual time=0.030..0.036 rows=3 loops=1)
                                                                    Buckets: 1024  Batches: 1  Memory Usage: 9kB
                                                                    ->  Hash Join  (cost=2.20..3.41 rows=2 width=51) (actual time=0.025..0.034 rows=3 loops=1)
                                                                          Hash Cond: (service_sport.service_id = services.id)
                                                                          ->  Hash Join  (cost=1.07..2.27 rows=3 width=27) (actual time=0.014..0.019 rows=3 loops=1)
                                                                                Hash Cond: (field_service.service_id = service_sport.service_id)
                                                                                ->  Seq Scan on field_service  (cost=0.00..1.13 rows=13 width=8) (actual time=0.003..0.004 rows=13 loops=1)
                                                                                ->  Hash  (cost=1.06..1.06 rows=1 width=19) (actual time=0.005..0.006 rows=1 loops=1)
                                                                                      Buckets: 1024  Batches: 1  Memory Usage: 9kB
                                                                                      ->  Seq Scan on service_sport  (cost=0.00..1.06 rows=1 width=19) (actual time=0.003..0.004 rows=1 loops=1)
                                                                                            Filter: (sport_code = 'BEACH_TENNIS'::text)
                                                                                            Rows Removed by Filter: 4
                                                                          ->  Hash  (cost=1.07..1.07 rows=4 width=24) (actual time=0.008..0.009 rows=4 loops=1)
                                                                                Buckets: 1024  Batches: 1  Memory Usage: 9kB
                                                                                ->  Seq Scan on services  (cost=0.00..1.07 rows=4 width=24) (actual time=0.004..0.006 rows=4 loops=1)
                                                                                      Filter: ((deleted_at IS NULL) AND (NOT is_private) AND ((status)::text = 'A'::text))
                                                                                      Rows Removed by Filter: 2
                                            ->  Memoize  (cost=0.29..8.31 rows=1 width=24) (actual time=0.001..0.001 rows=1 loops=96)
                                                  Cache Key: addresses.city_ibge_code
                                                  Cache Mode: logical
                                                  Hits: 95  Misses: 1  Evictions: 0  Overflows: 0  Memory Usage: 1kB
                                                  ->  Index Scan using cities_ibge_code_unique on cities  (cost=0.28..8.30 rows=1 width=24) (actual time=0.016..0.016 rows=1 loops=1)
                                                        Index Cond: (ibge_code = addresses.city_ibge_code)
                                      ->  Hash  (cost=1.27..1.27 rows=27 width=16) (actual time=0.020..0.020 rows=27 loops=1)
                                            Buckets: 1024  Batches: 1  Memory Usage: 10kB
                                            ->  Seq Scan on states  (cost=0.00..1.27 rows=27 width=16) (actual time=0.006..0.010 rows=27 loops=1)
                          ->  Hash  (cost=2.15..2.15 rows=5 width=43) (actual time=0.076..0.078 rows=5 loops=1)
                                Buckets: 1024  Batches: 1  Memory Usage: 9kB
                                ->  Nested Loop  (cost=1.03..2.15 rows=5 width=43) (actual time=0.070..0.074 rows=5 loops=1)
                                      Join Filter: (ss2.arena_id = subscriptions.arena_id)
                                      ->  HashAggregate  (cost=1.03..1.04 rows=1 width=16) (actual time=0.060..0.061 rows=1 loops=1)
                                            Group Key: subscriptions.arena_id
                                            Batches: 1  Memory Usage: 24kB
                                            ->  Seq Scan on subscriptions  (cost=0.00..1.02 rows=1 width=16) (actual time=0.008..0.009 rows=1 loops=1)
                                                  Filter: (((ends_at IS NULL) OR ((ends_at IS NOT NULL) AND (ends_at > '2024-10-06 01:31:18'::timestamp without time zone))) AND ((stripe_status)::text <> 'incomplete_expired'::text) AND ((stripe_status)::text <> 'unpaid'::text) AND ((stripe_status)::text <> 'past_due'::text) AND ((stripe_status)::text <> 'incomplete'::text) AND ((type)::text = 'access'::text))
                                      ->  Seq Scan on service_sport ss2  (cost=0.00..1.05 rows=5 width=27) (actual time=0.006..0.007 rows=5 loops=1)
                    ->  Hash  (cost=5.13..5.13 rows=213 width=16) (actual time=0.098..0.099 rows=213 loops=1)
                          Buckets: 1024  Batches: 1  Memory Usage: 18kB
                          ->  Seq Scan on field_time_slots  (cost=0.00..5.13 rows=213 width=16) (actual time=0.023..0.055 rows=213 loops=1)
Planning Time: 8.780 ms
Execution Time: 568.033 ms

This is the EXPLAIN ANALYZE output from the staging environment:

Sort  (cost=102.30..102.31 rows=2 width=405) (actual time=85.416..85.430 rows=1 loops=1)
  Sort Key: (sec_to_gc(cube_distance((ll_to_earth((addresses.latitude)::double precision, (addresses.longitude)::double precision))::cube, '(3491544.0649759113, -4339378.172513269, -3108045.069568795)'::cube)))
  Sort Method: quicksort  Memory: 25kB
  ->  GroupAggregate  (cost=101.18..102.29 rows=2 width=405) (actual time=85.406..85.420 rows=1 loops=1)
        Group Key: arenas.id, addresses.latitude, addresses.longitude, addresses.zip_code, addresses.street, addresses.number, addresses.complement, addresses.district, cities.name, states.name, states.uf
        ->  Sort  (cost=101.18..101.19 rows=2 width=397) (actual time=65.212..66.575 rows=10800 loops=1)
              Sort Key: arenas.id, addresses.latitude, addresses.longitude, addresses.zip_code, addresses.street, addresses.number, addresses.complement, addresses.district, cities.name, states.name, states.uf
              Sort Method: quicksort  Memory: 3448kB
              ->  Nested Loop  (cost=72.78..101.17 rows=2 width=397) (actual time=34.249..43.485 rows=10800 loops=1)
                    ->  Index Only Scan using sports_pkey on sports  (cost=0.15..8.17 rows=1 width=32) (actual time=0.019..0.024 rows=1 loops=1)
                          Index Cond: (code = 'BEACH_TENNIS'::text)
                          Heap Fetches: 1
                    ->  Hash Join  (cost=72.63..92.98 rows=2 width=429) (actual time=34.228..41.163 rows=10800 loops=1)
                          Hash Cond: (ss2.arena_id = arenas.id)
                          ->  Seq Scan on service_sport ss2  (cost=0.00..17.50 rows=750 width=48) (actual time=0.004..0.011 rows=5 loops=1)
                          ->  Hash  (cost=72.62..72.62 rows=1 width=493) (actual time=34.210..34.221 rows=2700 loops=1)
                                Buckets: 4096 (originally 1024)  Batches: 1 (originally 1)  Memory Usage: 1053kB
                                ->  Nested Loop  (cost=50.45..72.62 rows=1 width=493) (actual time=0.641..31.447 rows=2700 loops=1)
                                      ->  Nested Loop  (cost=50.30..72.44 rows=1 width=452) (actual time=0.629..23.566 rows=2700 loops=1)
                                            ->  Nested Loop Semi Join  (cost=50.01..64.14 rows=1 width=468) (actual time=0.617..7.491 rows=2700 loops=1)
                                                  Join Filter: (arenas.id = subscriptions.arena_id)
                                                  ->  Hash Join  (cost=49.88..61.77 rows=8 width=452) (actual time=0.605..2.421 rows=2700 loops=1)
                                                        Hash Cond: (field_time_slots.arena_id = arenas.id)
                                                        ->  Seq Scan on field_time_slots  (cost=0.00..10.23 rows=423 width=16) (actual time=0.004..0.047 rows=423 loops=1)
                                                        ->  Hash  (cost=49.86..49.86 rows=1 width=436) (actual time=0.589..0.596 rows=18 loops=1)
                                                              Buckets: 1024  Batches: 1  Memory Usage: 14kB
                                                              ->  Nested Loop  (cost=9.47..49.86 rows=1 width=436) (actual time=0.317..0.582 rows=18 loops=1)
                                                                    Join Filter: (arenas.id = contacts.contactable_id)
                                                                    Rows Removed by Join Filter: 18
                                                                    ->  Nested Loop  (cost=9.47..48.81 rows=1 width=420) (actual time=0.312..0.543 rows=18 loops=1)
                                                                          Join Filter: (services.id = service_sport.service_id)
                                                                          ->  Nested Loop  (cost=9.32..48.58 rows=1 width=412) (actual time=0.299..0.474 rows=58 loops=1)
                                                                                Join Filter: (services.id = field_service.service_id)
                                                                                ->  Nested Loop  (cost=0.57..34.38 rows=1 width=404) (actual time=0.289..0.377 rows=34 loops=1)
                                                                                      Join Filter: (arenas.id = fields.arena_id)
                                                                                      Rows Removed by Join Filter: 30
                                                                                      ->  Nested Loop  (cost=0.42..26.21 rows=1 width=388) (actual time=0.279..0.335 rows=16 loops=1)
                                                                                            Join Filter: (services.id = service_prices.service_id)
                                                                                            Rows Removed by Join Filter: 76
                                                                                            ->  Nested Loop  (cost=0.42..25.01 rows=1 width=380) (actual time=0.272..0.305 rows=4 loops=1)
                                                                                                  Join Filter: (addresses.addressable_id = arenas.id)
                                                                                                  ->  Nested Loop  (cost=0.28..16.84 rows=1 width=268) (actual time=0.262..0.288 rows=4 loops=1)
                                                                                                        Join Filter: (addresses.addressable_id = services.arena_id)
                                                                                                        Rows Removed by Join Filter: 4
                                                                                                        ->  Index Scan using addresses_addressable_type_addressable_id_index on addresses  (cost=0.14..8.67 rows=1 width=244) (actual time=0.254..0.272 rows=2 loops=1)
                                                                                                              Index Cond: ((addressable_type)::text = 'App\Models\Arena'::text)
                                                                                                              Filter: (((type)::text = 'COMMERCIAL'::text) AND (sec_to_gc(cube_distance((ll_to_earth((latitude)::double precision, (longitude)::double precision))::cube, '(3491544.0649759113, -4339378.172513269, -3108045.069568795)'::cube)) < '20000'::double precision))
                                                                                                        ->  Index Scan using services_arena_id_name_deleted_at_unique on services  (cost=0.14..8.16 rows=1 width=24) (actual time=0.004..0.006 rows=4 loops=2)
                                                                                                              Filter: ((NOT is_private) AND ((status)::text = 'A'::text))
                                                                                                              Rows Removed by Filter: 1
                                                                                                  ->  Index Scan using arenas_pkey on arenas  (cost=0.14..8.16 rows=1 width=112) (actual time=0.003..0.003 rows=1 loops=4)
                                                                                                        Index Cond: (id = services.arena_id)
                                                                                                        Filter: ((approved_at IS NOT NULL) AND (business_hours_data IS NOT NULL) AND (deleted_at IS NULL))
                                                                                            ->  Seq Scan on service_prices  (cost=0.00..1.12 rows=6 width=8) (actual time=0.002..0.004 rows=23 loops=4)
                                                                                                  Filter: is_default
                                                                                      ->  Index Scan using fields_arena_id_name_deleted_at_unique on fields  (cost=0.14..8.16 rows=1 width=16) (actual time=0.001..0.002 rows=4 loops=16)
                                                                                            Filter: ((status)::text = 'A'::text)
                                                                                ->  Bitmap Heap Scan on field_service  (cost=8.76..14.14 rows=5 width=8) (actual time=0.001..0.001 rows=2 loops=34)
                                                                                      Recheck Cond: (service_id = service_prices.service_id)
                                                                                      Heap Blocks: exact=34
                                                                                      ->  Bitmap Index Scan on field_service_arena_id_service_id_field_id_unique  (cost=0.00..8.76 rows=5 width=0) (actual time=0.001..0.001 rows=2 loops=34)
                                                                                            Index Cond: (service_id = service_prices.service_id)
                                                                          ->  Index Only Scan using service_sport_service_id_sport_code_unique on service_sport  (cost=0.15..0.22 rows=1 width=40) (actual time=0.001..0.001 rows=0 loops=58)
                                                                                Index Cond: ((service_id = field_service.service_id) AND (sport_code = 'BEACH_TENNIS'::text))
                                                                                Heap Fetches: 18
                                                                    ->  Seq Scan on contacts  (cost=0.00..1.04 rows=1 width=16) (actual time=0.001..0.001 rows=2 loops=18)
                                                                          Filter: (is_main AND ((contactable_type)::text = 'App\Models\Arena'::text))
                                                                          Rows Removed by Filter: 1
                                                  ->  Index Scan using subscriptions_arena_id_stripe_status_index on subscriptions  (cost=0.14..0.28 rows=1 width=16) (actual time=0.001..0.001 rows=1 loops=2700)
                                                        Index Cond: (arena_id = field_time_slots.arena_id)
                                                        Filter: (((ends_at IS NULL) OR ((ends_at IS NOT NULL) AND (ends_at > '2024-10-06 01:31:18'::timestamp without time zone))) AND ((stripe_status)::text <> 'incomplete_expired'::text) AND ((stripe_status)::text <> 'unpaid'::text) AND ((stripe_status)::text <> 'past_due'::text) AND ((stripe_status)::text <> 'incomplete'::text) AND ((type)::text = 'access'::text))
                                            ->  Index Scan using cities_ibge_code_unique on cities  (cost=0.28..8.30 rows=1 width=24) (actual time=0.005..0.005 rows=1 loops=2700)
                                                  Index Cond: (ibge_code = addresses.city_ibge_code)
                                      ->  Index Scan using states_pkey on states  (cost=0.15..0.18 rows=1 width=56) (actual time=0.002..0.002 rows=1 loops=2700)
                                            Index Cond: (ibge_code = cities.state_ibge_code)
Planning Time: 6.426 ms
Execution Time: 85.641 ms

Any idea why this might be happening? I didn't focus on performance because we're at an early stage, and I thought it wouldn't matter yet given how little data we have.

We've already tried upgrading the server instance from a db.t3.micro to a db.t3.small; nothing changed. We've also tried restoring it in another availability zone; that made no difference either. I tried restoring the production dump locally and running the query there: the estimated cost is around 6,000 — higher than staging, but still a lot less than the ~55,000 on production. When running the query in a local dev environment, it also costs around 100.

Obviously, I can rewrite and improve this query, and I will do so very soon. But I'd really like to understand what is happening here.

EDIT: amount of data

WITH tbl AS
  (SELECT table_schema,
          TABLE_NAME
   FROM information_schema.tables
   WHERE TABLE_NAME not like 'pg_%'
     AND table_schema in ('public'))
SELECT sum((xpath('/row/c/text()', query_to_xml(format('select count(*) as c from %I.%I', table_schema, TABLE_NAME), FALSE, TRUE, '')))[1]::text::int) AS rows_n
FROM tbl
ORDER BY rows_n DESC;

Production amount of rows in each table: https://pastebin.com/7gj5PxRw

Staging amount of rows in each table: https://pastebin.com/507x3mP0

EDIT 2: Laurenz's suggestion actually helped, and the execution time improved. However, I still can't understand why the plan processed so many rows, and I'd really like to dig deeper into that. For now, I've split this query into two distinct queries, and performance has improved dramatically.
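For reference, here is a minimal sketch of how such a split might look. This is my own illustrative decomposition, not necessarily the exact queries used; the `:arena_ids` parameter is a placeholder for the ids returned by the first query:

```sql
-- Query 1 (sketch): nearby approved arenas with their commercial address
SELECT arenas.id,
       arenas.display_name,
       addresses.latitude,
       addresses.longitude,
       earth_distance(ll_to_earth(addresses.latitude, addresses.longitude),
                      ll_to_earth(-10.5555, -41.2751)) AS meters_distance
FROM   arenas
       JOIN addresses
         ON addresses.addressable_id = arenas.id
            AND addresses.addressable_type = 'App\Models\Arena'
            AND addresses.type = 'COMMERCIAL'
WHERE  arenas.deleted_at IS NULL
       AND earth_distance(ll_to_earth(addresses.latitude, addresses.longitude),
                          ll_to_earth(-10.5555, -41.2751)) < 20000
ORDER  BY meters_distance;

-- Query 2 (sketch): aggregate the available sports for just those arena ids,
-- so the aggregation no longer multiplies rows across all the other joins
SELECT arena_id,
       array_to_json(array_agg(DISTINCT sport_code)) AS available_sports
FROM   service_sport
WHERE  arena_id = ANY (:arena_ids)  -- ids returned by query 1
GROUP  BY arena_id;
```

Splitting this way keeps the row-multiplying `ss2` self-join out of the main query, which is one plausible reason for the dramatic improvement.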

mtbossa
1 Answer

The key difference is here:

Production:

->  Sort  (cost=12417.08..12555.53 rows=55380 width=286) (actual time=222.049..445.141 rows=102240 loops=1)
      Sort Key: arenas.id, addresses.latitude, addresses.longitude, addresses.zip_code, addresses.street, addresses.number, addresses.complement, addresses.district, cities.name, states.name, states.uf
      Sort Method: external merge  Disk: 28144kB

Staging:

->  Sort  (cost=101.18..101.19 rows=2 width=397) (actual time=65.212..66.575 rows=10800 loops=1)
      Sort Key: arenas.id, addresses.latitude, addresses.longitude, addresses.zip_code, addresses.street, addresses.number, addresses.complement, addresses.district, cities.name, states.name, states.uf
      Sort Method: quicksort  Memory: 3448kB

The sort has to process roughly ten times as many rows on the production system (102240 vs. 10800). On top of that, sorting the 10800 rows on the staging system can happen in memory, while the production system has to write and read temporary files, because its ~100,000 rows don't fit into work_mem.

You can increase performance on the production system by increasing work_mem (but take care not to run out of memory).
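As an illustration, you can try a larger work_mem for a single session before changing it globally. The 64MB value here is only an assumption — size it so the ~28 MB on-disk sort fits, and remember that work_mem applies per sort/hash node, per backend:

```sql
-- Assumed value for illustration; tune to your workload and available RAM
SET work_mem = '64MB';

EXPLAIN (ANALYZE, BUFFERS)
SELECT ...;  -- the query above; look for "Sort Method: quicksort"
             -- instead of "external merge  Disk: ..."

RESET work_mem;
```

If the plan switches to an in-memory sort (or a hash aggregate) and the timing improves, the setting can then be applied via the RDS parameter group.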

From PostgreSQL v16 on, the fast plan might be even faster, because commit b592422095 added the ability to use an "incremental sort" for GROUP BY. Even better, if you increase work_mem enough, you might get a faster "hash aggregate".

Laurenz Albe