6

When SQL Server builds sampled statistics, how is the sampling percentage calculated?

For example, updating statistics on the PostId column of the dbo.Votes table in the Stack Overflow 2013 database:

UPDATE STATISTICS dbo.Votes
(
    _WA_Sys_00000002_0AD2A005
) 
WITH MAXDOP = 1;

SQL Server builds an internal statement:

SELECT 
    StatMan
    (
        [SC0], 
        [SB0000]
    ) 
    FROM 
    (
        SELECT TOP 100 PERCENT 
            [SC0], 
            step_direction([SC0]) over (order by NULL) AS [SB0000]  
        FROM 
        (
            SELECT [PostId] AS [SC0] 
            FROM [dbo].[Votes] 
                TABLESAMPLE SYSTEM (9.234204e-01 PERCENT) 
                WITH (READUNCOMMITTED) 
        ) AS _MS_UPDSTATS_TBL_HELPER 
        ORDER BY 
            [SC0], 
            [SB0000] 
    ) AS _MS_UPDSTATS_TBL  
OPTION (MAXDOP 1)

Where does the 9.234204e-01 PERCENT come from?

Paul White
  • 94,921
  • 30
  • 437
  • 687

1 Answers1

5

The sampling percentage is derived from the number of rows in the table and the number of leaf level pages in the index chosen for the statistics collection. This 'index' will most often be the base table (clustered index or heap).

These two input parameters can be obtained from a DMV:

DECLARE 
    @NumRows float,
    @NumPages float;

SELECT @NumRows = CONVERT(float, SUM(IPS.record_count)), @NumPages = CONVERT(float, SUM(IPS.page_count)) FROM sys.dm_db_index_physical_stats ( DB_ID(), OBJECT_ID(N'dbo.Votes', 'U'), 1, NULL, N'DETAILED' ) AS IPS WHERE IPS.index_level = 0;

The calculation is:

DECLARE
    @MinPages float = 1.024e+003,
    @HundredPercent float = 1e+002,
    @Magic1 float = 1.5e+001,
    @Magic2 float = 5.5e-001;

SELECT SamplePercentage = IIF ( @NumPages < @MinPages, @HundredPercent, @HundredPercent * LEAST ( @MinPages + ROUND ( @NumPages * ( @Magic1 * ROUND ( POWER ( @NumRows, @Magic2 ), 0, 1 ) ) / @NumRows, 0, 1 ), @NumPages ) / @NumPages );

This returns:

SamplePercentage
0.923420362201546

Which matches the 9.234204e-01 PERCENT given in the question.

Paul White
  • 94,921
  • 30
  • 437
  • 687