2

I'm making a FDW (Foreign Data Wrapper) in PostgreSQL for myself using C. And in the test environment below, myFDW receives the query with/without LIMIT clause case by case.

Test environment:

psql → PC1(Postgresql + postgres_fdw) → PC2(Postgresql + myFDW) → tableA@myUniqueDB

tableA:

  • ts as timestamp
  • data1 as integer

Such as when I type the following in psql:

select ts
from tableA
where (ts > '2022-01-01')
  and (ts < '2022-12-31')
order by ts desc
limit 10;

myFDW receives

select ts
from tableA
where (ts > '2022-01-01')
  and (ts < '2022-12-31')
limit 10;

and when I type the following in psql:

select ts,
       data1
from tableA
where (ts > '2022-01-01')
  and (ts < '2022-12-31')
order by ts desc
limit 10;

myFDW receives

select ts
from tableA
where (ts > '2022-01-01')
  and (ts < '2022-12-31');

1 Answers1

0

postgres_fdw is not guaranteed to push down everything it can. During query planning, it will try different combinations and estimate the cost. Eventually, it will pick the execution plan that promises the shortest execution time.

To come up with good estimates, you either have to explicitly run ANALYZE on the foreign table, so that PostgreSQL gathers statistics for it, or you have to set the foreign table option use_remote_estimate 'on' so that each query runs an EXPLAIN on the remote side.

The quality of the latter method depends on the quality of the estimates on PC2 and on the statistics of your foreign table. ANALYZE should work in any case, since it fetches actual data from the foreign table.

With a complicated architecture like that, it is not unsurprising if some information is lost in transit.

Laurenz Albe
  • 61,070
  • 4
  • 55
  • 90