4

I want to return my post rows in one of three sort orders, controlled by a sorting_param function parameter. The sorting is complex, doesn't lend itself to a simple ORDER BY CASE .... So I'm thinking of writing each sorting into its own CTE, and then picking one of the CTEs as the final output. However, I wouldn't want PostgreSQL to waste time on all three, just the one I selected. So will all three be materialized or only the one I picked? Or is there a better way to do this? PostgreSQL version 16. Simplified code below.

WITH posts_by_order1 AS
(
    SELECT p.id
    FROM post p
    ORDER BY [some complex ordering 1]
    LIMIT 10
),
posts_by_order2 AS
(
    SELECT p.id
    FROM post p
    ORDER BY [some complex ordering 2]
    LIMIT 10
),
posts_by_order3 AS
(
    SELECT p.id
    FROM post p
    ORDER BY [some complex ordering 3]
    LIMIT 10
)
SELECT * FROM posts_by_order1 WHERE sorting_param = 0
UNION ALL
SELECT * FROM posts_by_order2 WHERE sorting_param = 1
UNION ALL
SELECT * FROM posts_by_order3 WHERE sorting_param = 2;

Unfortunately I can't do this:

    ORDER BY 
        CASE 
            WHEN sorting_param = 0 THEN p.date 
            WHEN sorting_param = 1 THEN p.name 
            ELSE NULL 
        END DESC, ...
Erwin Brandstetter
  • 185,527
  • 28
  • 463
  • 633

1 Answers1

4

The manual:

Data-modifying statements in WITH are executed exactly once, and always to completion, independently of whether the primary query reads all (or indeed any) of their output. Notice that this is different from the rule for SELECT in WITH: as stated in the previous section, execution of a SELECT is carried only as far as the primary query demands its output.

Bold emphasis mine.

The optional key word MATERIALIZED has no impact on this. The result set of a CTE can only be "materialized" after it has been executed, but we never get there ...

So your construct basically works. Test your query with EXPLAIN ANALYZE and you will see (never executed) next to the plans for CTEs that did not actually get called. The CTEs still have to be planned, which incurs a minor overhead regardless.

Also, the sort order in subqueries or CTEs is carried over to the outer SELECT as long as nothing happens there that might trigger a different sort: UNION (without ALL), JOIN, additional WHERE clause, ... But this is an implementation detail, and there are no guarantees. This addresses the order within the result set of one subquery. Similar considerations apply to the order among UNION ALL legs. Related:

Depending on how you intend to use this construct, there may be better options, including:

  • Just switch to the right statement in your client code.

  • Use a PL/pgSQL function and switch with CASE inside. Example:

CREATE OR REPLACE FUNCTION my_func(_sorting_param int = 0)
  RETURNS TABLE (id int)  -- matching column definition list
  LANGUAGE plpgsql STRICT AS
$func$
BEGIN
   CASE _sorting_param
   WHEN 1 THEN  -- posts_by_order1
      RETURN QUERY
      SELECT p.id
      FROM   post p
      ORDER  BY [some complex ordering 1]
      LIMIT  10
   WHEN 2 THEN  -- posts_by_order2
      ...
   ELSE  -- ?
      ...
   END CASE;
END
$func$;
  • Use dynamic SQL in a PL/pgSQL function. Example:
CREATE OR REPLACE FUNCTION my_func(_sorting_param int = 0)
  RETURNS TABLE (id int)  -- matching column definition list!
  LANGUAGE plpgsql STRICT AS
$func$
BEGIN
   RETURN QUERY EXECUTE format(
   $q$
   SELECT p.id
   FROM   post p
   ORDER  BY %s
   LIMIT  10
   $q$
 , CASE _sorting_param
   WHEN 1 THEN  ' ... '  -- posts_by_order1
   WHEN 2 THEN  ' ... '  -- posts_by_order2
   ELSE true
   END
   );
END
$func$;

Call either with:

SELECT * FROM my_func(0);

Take care not to introduce SQL-injection with dynamic SQL. See:

Erwin Brandstetter
  • 185,527
  • 28
  • 463
  • 633