Order and nature of columns in select list of sorted UNION query affects performance

Question

I am observing significant performance impact of select lists in sorted UNION queries.

The general form of UNION queries I am working with is:

SELECT * FROM (
  SELECT <select_list> FROM <table>
  UNION ALL
  ...
) q
ORDER BY <column>

The outer select is used as it significantly improves performance in the presence of ORDER BY, but that is outside of the scope of this question. UNION ALL is always used, rather than UNION.

I will refer to performance as either "fast" (instantaneous) or "slow" (5 seconds or longer).

Performance was measured by running queries in the DBeaver console, which by default retrieves only the first page, whose size is 200.

Changing the select list affects performance in the following ways:

When <select_list> = <column> (i.e., only the sorting column is selected), queries are fast. The execution plan shows index scans for the sorting column and merge joins for concatenation.
When the select list includes other columns in addition to the sorting column, queries may be fast or slow. It has been observed that when the following conditions are all true, queries are fast:
- Select list includes the clustered index column.
- Select list starts with the clustered index column, or with the sorting column immediately followed by the clustered index column.

Observations were made using Microsoft SQL Server 2019 (RTM-CU26) (KB5035123) - 15.0.4365.2 (X64).

I could not find anything about the effects of select lists on the performance of UNION in the SQL Server documentation.

A simplified description of the environment in which experiments were conducted, and the queries themselves are given below.

CREATE TABLE AUDIT1 (
    ID bigint NOT NULL,
    AUDITDATE datetime2 NULL,
    [USER] bigint NULL,
  -- Implies clustered index.
    CONSTRAINT PK_AUDIT1 PRIMARY KEY (ID)
);
CREATE INDEX I_AUDIT1_AUDITDATE ON AUDIT1 (AUDITDATE);
CREATE TABLE AUDIT2 (
    ID bigint NOT NULL,
    AUDITDATE datetime2 NULL,
    [USER] bigint NULL,
  -- Implies clustered index.
    CONSTRAINT PK_AUDIT2 PRIMARY KEY (ID)
);
CREATE INDEX I_AUDIT2_AUDITDATE ON AUDIT2 (AUDITDATE);

Table AUDIT1 contains 10 million records.
Table AUDIT2 contains 1 million records.
Values of AUDITDATE in AUDIT2 are greater than those in AUDIT1.
Values of AUDITDATE are aligned with ID in an increasing sequence, i.e., ID is always increasing, and so is AUDITDATE.

Query 1: select list contains only the sorting column (fast).

SELECT * FROM (
  SELECT AUDITDATE FROM AUDIT2
  UNION ALL 
  SELECT AUDITDATE FROM AUDIT1
) q
ORDER BY AUDITDATE

Execution plan:

|--Merge Join(Concatenation)
   |--Index Scan(AUDIT2.I_AUDIT2_AUDITDATE), ORDERED BACKWARD
   |--Index Scan(AUDIT1.I_AUDIT1_AUDITDATE), ORDERED BACKWARD

Query 2: select list of size > 1, contains the sorting column, does not contain the clustered index column (slow).

SELECT * FROM (
  SELECT [USER], AUDITDATE FROM AUDIT2
  UNION ALL 
  SELECT [USER], AUDITDATE FROM AUDIT1
) q
ORDER BY AUDITDATE

Note that placing the sorting column first was observed to have no effect.

Query 3: select list of size > 1, contains the sorting column, contains the clustered index column, first column is neither the sorting column, nor the clustered index column (slow).

SELECT * FROM (
  SELECT [USER], ID, AUDITDATE FROM AUDIT2
  UNION ALL 
  SELECT [USER], ID, AUDITDATE FROM AUDIT1
) q
ORDER BY AUDITDATE

Query 4: select list of size > 1, contains the sorting column, contains the clustered index column, first column is either the sorting column, or the clustered index column (fast).

SELECT * FROM (
  SELECT ID, [USER], AUDITDATE FROM AUDIT2
  UNION ALL 
  SELECT ID, [USER], AUDITDATE FROM AUDIT1
) q
ORDER BY AUDITDATE

This query shows that if the first column is ID, the query is fast.

Execution plans for queries 2, 3, 4 are the same:

|--Parallelism(Gather Streams, ORDER BY:([Union1007] ASC))
   |--Sort(ORDER BY:([Union1007] ASC))
      |--Concatenation
         |--Parallelism(Distribute Streams, RoundRobin Partitioning)
         |--Clustered Index Scan(OBJECT:(AUDIT2.PK_AUDIT2))
         |--Clustered Index Scan(OBJECT:(AUDIT1.PK_AUDIT1))

Paul White · Accepted Answer · 2025-02-17T12:32:11.130

The important point is whether SQL Server considers an expensive sort is necessary or not.

I addressed this in my article Avoiding Sorts with Merge Join Concatenation.

Key points:

Merge Join Concatenation is a special execution mode of the regular Merge Join operator, which preserves the order of the join keys only.
There are no 'join keys' in a concatenation operation. Instead, all projected columns from the concatenation take part in the 'preserved order'.
As a starting point, merge concatenation inputs are sorted on the output projection list.
The ORDER BY clause can override the starting point to avoid sorting twice.
Any uniqueness guarantees that exist can be used by the optimizer to truncate the required list of sorted columns (except at exchanges in row mode parallel plans). The ID column plays this role in your example. No further sorting is necessary after a unique key.
The optimizer's reasoning with sort orders is neither perfect nor comprehensive. Reasonable efforts are made.

You can often see the input sort order the optimizer is after by adding OPTION (MERGE UNION) to the test query and examining the execution plan to see what the sorts on the Merge Join Concatenation inputs are doing.

For example, this reveals that query 2 wants input ordered by (AUDITDATE ASC, [USER] ASC), which the indexes cannot provide.

An index on (AUDITDATE ASC, [USER] ASC) or (AUDITDATE DESC, [USER] DESC) could provide that order.

Query 3 wants (AUDITDATE ASC, [USER] ASC, ID ASC).

Query 4 is fast because the guaranteed unique column ID is listed first. This uniqueness means this no more ordering is needed after AUDITDATE, ID. The requirement to sort on AUDITDATE comes from the ORDER BY clause. ID is needed by the merge. No further sortedness is needed by the merge because ID is unique.

The AUDITDATE, ID order can be provided by the nonclustered index (by virtue of it being non-unique, so ID is part of the key).

SELECT * FROM (
  SELECT ID, [USER], AUDITDATE FROM AUDIT2
  UNION ALL 
  SELECT ID, [USER], AUDITDATE FROM AUDIT1
) q
ORDER BY AUDITDATE
OPTION (MERGE UNION);

You can see the nonclustered index provides order on AUDITDATE, ID:

SELECT A.AUDITDATE, A.ID 
FROM dbo.AUDIT1 AS A 
ORDER BY A.AUDITDATE, A.ID;

I now see that merge join is used in all queries. Only for fast queries, however, sorting is avoided.

Yes, that's the main issue. Avoiding sorting, where that is possible, is often beneficial to performance.

You statement "the ORDER BY clause can override the starting point to avoid sorting twice." does not seem to apply to queries 2 and 3, which are slow despite the ORDER BY. It seems to depend on the order of columns in the select list.

Yes, as I said, the projection list is the starting point. The ORDER BY clause can override this if the ORDER BY is a suitable arrangement for the merge's requirements. That doesn't mean it always will. I cover this extensively with examples in the article.

Could you elaborate on the comment in your SQL snippet near ORDER BY? Why should there be a compatibility between "merge concat output order" and sorting columns?

The presentation order needs to be compatible with an ordering the merge can use to avoid a sort.

Also, if you take query 3 and replace * in the outer select by ID, [USER], AUDITDATE - it remains slow.

That is outside the merge. The following uses that order and can avoid a sort:

SELECT [USER], ID, AUDITDATE FROM (
  SELECT ID, [USER], AUDITDATE FROM AUDIT2
  UNION ALL 
  SELECT ID, [USER], AUDITDATE FROM AUDIT1
) q
ORDER BY AUDITDATE
OPTION (MERGE UNION);

Note the key lookup required may still make the query 'slow', but that's a separate issue. You may of course get a different plan with the data you have.

On the other hand, take query 4 and replace * by [USER], ID, AUDITDATE - it remains fast.

Yes, because ID with its uniqueness guarantee is still first at the merge.

Finally, adding OPTION(MERGE UNION) to query 4 makes it slow, with the plan being similar to that of query 3 but with added parallelism.

The point is: A plan without sorting is possible. The optimizer considers many alternatives and chooses the one that costs lowest. That might well still be 'slow' due to e.g. lookups as already mentioned.

A slightly more advanced example showing the tension between the requested presentation order, projected columns, merge order preservation, optimizer reasoning, cost estimation and index order:

SELECT
    Q.AUDITDATE, 
    Q.ID, 
    Q.[USER] 
FROM 
(
    SELECT 
        A2.AUDITDATE, 
        A2.ID, 
        A2.[USER] 
    FROM AUDIT2 AS A2
UNION ALL

SELECT 
    A1.AUDITDATE, 
    A1.ID, 
    A1.[USER] 
FROM AUDIT1 A1

) AS Q
WHERE
    -- Give the optimizer a reason to read the index backwards
    Q.AUDITDATE <= CONVERT(datetime2(7), '9999-12-31T23:59:59.9999999', 126)
ORDER BY
    -- Compatible with merge concat output order
    Q.AUDITDATE DESC,
    Q.ID DESC
OPTION 
(
    MERGE UNION
);

As a side note, USER is a poor choice of column name. It is a syntax error unless quoted because it is a system niladic function.

Order and nature of columns in select list of sorted UNION query affects performance

1 Answers1

Linked