2

I was considering using incremental statistics in my data warehouse but I read an article by Erin Stellato that says the query optimizer doesn't use incremental statistics. The article was written in May of 2015 but I haven't seen anything recanting her position in the following 6 years. There are a number of articles in different communities that show how to set it up, but if it's not useful, why bother? Does anyone know if subsequent versions of the query optimizer in 2016, 2017, or 2019 support the use of incremental statistics? If not, should we even use them? If they won't help the engine make a good decision about how to query a table with 10 billion records what good is it? Thanx for any help!

J. Mini
  • 1,161
  • 8
  • 32
Mike Lemay
  • 57
  • 2

2 Answers2

7

Presumably your older partitions are not changing as frequently (or at all).

You bother because with incremental, when you update stats, you only update the current/active partition(s), which reduces the time you spend updating statistics (Erin talks about this in a follow-up article here). Yes, those get squished and folded into the larger histogram, and yes, I too had hoped the optimizer would be using them by now. But alas, it does not.

If you didn't use incremental, you're updating stats for the whole 10 billion row table (which takes more time) and what have you gained? Roughly the same histogram.

They're more useful for partition elimination (which I hope is useful for your 10 billion row table!) than for cardinality within, say, the latest partition.

If you want to take advantage of more histogram steps inside an individual partition, you could also maintain filtered statistics, maybe only on your most actively queried partitions. (These are useful for non-partitioned tables, too, where 200/201 steps is just not enough.)

From the CREATE STATISTICS documentation:

WHERE <filter_predicate> Specifies an expression for selecting a subset of rows to include when creating the statistics object. Statistics that are created with a filter predicate are called filtered statistics. The filter predicate uses simple comparison logic and cannot reference a computed column, a UDT column, a spatial data type column, or a hierarchyID data type column. Comparisons using NULL literals are not allowed with the comparison operators. Use the IS NULL and IS NOT NULL operators instead.

Here are some examples of filter predicates for the Production.BillOfMaterials table:

  • WHERE StartDate > '20000101' AND EndDate <= '20000630'
    
  • WHERE ComponentID IN (533, 324, 753)
    
  • WHERE StartDate IN ('20000404', '20000905') AND EndDate IS NOT NULL
    

Some info on filtered stats:

Aaron Bertrand
  • 181,950
  • 28
  • 405
  • 624
2

Aaron Bertrand's fine answer tells you why you would still use incremental statistics and how to get what you wanted from them, but it doesn't give proof answering your headline question: Does the Query Optimizer use Incremental Statistics yet?

The Answer

As of the SQL Server 2025 preview, the query optimiser still does not use incremental statistics.

The below demo shows you how to spin up a Docker environment with different versions of SQL Server. That should make this answer evergreen. As SQL Server 2025 is still in preview, you should try this again after a few months. I got the same results on SQL Server 2022.

The Proof

Premise

I couldn't replicate Simon Liew's demo, but WideWorldImporters is a much smaller database and has partitioned tables out of the box so it's easier to work with than Erin's excellent example. I have therefore followed Simon's footsteps and used WideWorldImporters. Specifically, I have used its Sales.CustomerTransactions table. It is partitioned by TransactionDate. Since I want to try this on SQL Server 2025 as it matures, I have started by installing the Docker container. Today's date is the 1st of June 2025.

Make An Instance

I have assumed that you already have Docker installed and you are familiar with it. You should run this on your scrap laptop, not your prod server.

docker pull mcr.microsoft.com/mssql/server:2025-latest

docker run -e "ACCEPT_EULA=Y"
-e "MSSQL_SA_PASSWORD=YourPasswordMake~TSp3cial"
-p 1433:1433
--name sql2025P1 --hostname sql2025P1
-d mcr.microsoft.com/mssql/server:2025-latest

docker start sql2025P1

Download WideWorldImporters

It's at this link. I used the OLTP one.

Copy WideWorldImporters to Container

docker cp '/whatever/Downloads/WideWorldImporters-Full.bak' sql2025P1:/var/opt/mssql/data/

Connect

Crack open your IDE and connect to your instance. Try ifconfig if you can't get its IP. Its on port 1433. Its name is sql2025P1 and you set the sa password earlier, so login as sa.

Restore WideWorldImporters

I stole this from GitHub.

USE [master];
GO
RESTORE DATABASE [WideWorldImporters]
FROM DISK = '/var/opt/mssql/data/WideWorldImporters-Full.bak'
WITH
    MOVE 'WWI_Primary' TO '/var/opt/mssql/data/WideWorldImporters.mdf',
    MOVE 'WWI_UserData' TO '/var/opt/mssql/data/WideWorldImporters_UserData.ndf',
    MOVE 'WWI_Log' TO '/var/opt/mssql/data/WideWorldImporters.ldf',
    MOVE 'WWI_InMemory_Data_1' TO '/var/opt/mssql/data/WideWorldImporters_InMemory_Data_1',
    FILE = 1,
    NOUNLOAD,
    STATS = 5;
GO

Change Compatibility Level

Use whatever is the newest. 170 is SQL Server 2025.

ALTER DATABASE [WideWorldImporters] SET COMPATIBILITY_LEVEL = 170

USE [WideWorldImporters];

Enable Incremental Statistics

Now we take a leaf from Erin's book and get some incremental stats

ALTER INDEX [CX_Sales_CustomerTransactions] ON Sales.CustomerTransactions
  REBUILD WITH (STATISTICS_INCREMENTAL = ON);
GO

UPDATE STATISTICS Sales.CustomerTransactions WITH FULLSCAN;

Check Incremental Stats

This confirms that we have incremental stats.

SELECT stats_id, rows, steps, node_id, left_boundary, right_boundary, partition_number
FROM [sys].[dm_db_stats_properties_internal](OBJECT_ID('Sales.CustomerTransactions'),1)
ORDER BY [node_id];

dm_db_stats_properties_internal

Which Estimate Do We Get?

This query's plan estimates 80 rows and hits partition 1. Check the plan for proof.

DBCC FREEPROCCACHE;
GO

SET STATISTICS XML ON; SELECT * FROM Sales.CustomerTransactions WHERE TransactionDate = '20130108'; SET STATISTICS XML OFF;

Stats from the plan

DBCC SHOW_STATISTICS confirms this.

DBCC SHOW_STATISTICS('Sales.CustomerTransactions','CX_Sales_CustomerTransactions');

SHOW_STATISTICS

But this DBCC SHOW_STATISTICS is for regular, non-incremental stats! So what do our incremental stats say?

We steal from Erin again. Note that we pass in 2 as our third argument, because we want to see partition 1.

DBCC TRACEON(2309);
GO
DBCC SHOW_STATISTICS('Sales.CustomerTransactions','CX_Sales_CustomerTransactions', 2);

SHOW_STATISTICS partitioned

It says that we should have predicted 71 rows. Thus, we can see that our incremental statistics were not used.

Cleanup

Stop the container.

docker stop sql2025P1

Delete the container.

docker rm sql2025P1

To remove the image, see here.

J. Mini
  • 1,161
  • 8
  • 32