3

I have function with CTE which looks like this:

function param(p_userId, p_templateId)

    with a(select...),
         b (select..), 
         c (insert..returning..), 
         d (update..returning..),
         ...k(select..)
    insert ... 
    select * from k 
    where  userId = p_userId 
    and    templateId = p_templateId

I'm able to put the where clause right at the begining in the a CTE. If I place the condition at the end like I have it now, will it filtered the rows out at the earliest or at the latest time? What is faster?

Basically what I do is, in the CTE's which appear sooner, like the a or b, (rather then k), I'm prefetching data or precalculating columns in order to use them later in the following CTEs.

Hope this question is still not too vague.

Daniel Katz
  • 173
  • 1
  • 2
  • 6

2 Answers2

5

CTEs pose as optimization fences in Postgres.

Each CTE is planned, executed and materialized separately. The predicates of a later CTE are not visible in earlier CTEs. Place conditions to eliminate rows as early as possible (a or b in your example).

Or rewrite with subqueries to allow more aggressive optimizations in Postgres. (But you should still eliminate irrelevant rows as early as possible, optimization only goes so far ...)

Details in Craig's blog post:

This changed with Postgres 12. See:

Erwin Brandstetter
  • 185,527
  • 28
  • 463
  • 633
4

Update: Since v12, PostgreSQL automatically folds many CTEs into the parent query and pushes predicates to the optimal levels. The accepted answer is still robust for all CTEs, but may not be necessary for your case. Your psuedo-code doesn't give enough details for a definitive answer.

https://www.postgresql.org/docs/12/queries-with.html

However, if a WITH query is non-recursive and side-effect-free (that is, it is a SELECT containing no volatile functions) then it can be folded into the parent query, allowing joint optimization of the two query levels. By default, this happens if the parent query references the WITH query just once, but not if it references the WITH query more than once. You can override that decision by specifying MATERIALIZED to force separate calculation of the WITH query, or by specifying NOT MATERIALIZED to force it to be merged into the parent query. The latter choice risks duplicate computation of the WITH query, but it can still give a net savings if each usage of the WITH query needs only a small part of the WITH query's full output.