1

Today i asked question about bad query performance. That query had a lot of joins over a lot of views. I think i have found which view is a trouble for SQL server. Here is actuall query plan for select run on that view https://www.brentozar.com/pastetheplan/?id=BJB4WOV6S As you can see SQL Server highly overestimates number of rows and needed memory. Outter apply is used to run TVF that is returning processed XML, xmls are rather very small. I wonder how to avoid such overestimation(and what cuase them)? Should i create xml indexes for my table?

One of the TVF FUNCTION [schema].[functionname](@xmlData XML) RETURNS TABLE AS return SELECT Tbl.Col.value('PN[1]', 'nvarchar(32)') as PPN FROM @xmlData.nodes('PI') Tbl(Col) GO example declare @para as xml set @para = '<PI><PGID>00000000000000000000000000000000</PGID><PN>0</PN></PI>' Select * from [schema].[functionname](@para)

whd
  • 61
  • 1
  • 1
  • 9

2 Answers2

2

I think the base cardinality estimate for XML TVF is 10,000 rows - I read it in a presentation by Michael Rys

To improve estimates, you can either add a schema to your XML or create an XML index.

Other than that I could only direct you to row goal hints and estimates trick I first heard from Adam Machanic here

Zikato
  • 5,619
  • 1
  • 17
  • 34
2

The first part here is removing the high estimates on the TVF.

Test data:

declare @para as xml 
set @para = '<PI><PGID>00000000000000000000000000000000</PGID><PN>0</PN></PI>'

CREATE TABLE dbo.Test(id int identity(1,1) primary key not null, val int, xmldata xml)

INSERT INTO dbo.Test WITH(TABLOCK)(val,xmldata)
SELECT TOP(400000) 
ROW_NUMBER() OVER(ORDER BY(SELECT NULL)),
@para
FROM master..spt_values spt1
CROSS APPLY master..spt_values spt2

Primary Index used

CREATE PRIMARY XML INDEX Test_xmlData  
    ON dbo.Test(xmldata);  

Query rewrite #1:

SELECT t.*, Tbl.Col.value('PN[1]', 'nvarchar(32)') as PPN
FROM dbo.Test t
OUTER APPLY t.xmldata.nodes('PI') as Tbl(Col);

This first part removes the high estimates, execution plans found here:

Original plan without the xml index, high estimates

Plan with the xml index, lower estimates

Query rewrite #2:

We could simplify the query to this:

SELECT t.*, xmldata.value('(PI/PN)[1]', 'nvarchar(32)') as PPN
FROM dbo.Test t;

Index #2, selective index:

A selective index could then be created:

CREATE SELECTIVE XML INDEX sxi_index
ON dbo.Test(xmldata)
FOR (
pathPIPN= 'PI/PN' AS SQL NVARCHAR(32) 
);

With the final execution statistics:

 SQL Server Execution Times:
   CPU time = 2109 ms,  elapsed time = 11066 ms.

Cpu time is ~ 3 times lower compared to the query stats of the original query:

SELECT *
FROM dbo.Test 
OUTER APPLY [dbo].[functionname](xmldata);

 SQL Server Execution Times:
   CPU time = 6000 ms,  elapsed time = 10196 ms.

And the query plan for the simplified query & selective index:

enter image description here

With the estimates matching the actual values.

Query plan

The selective index could also be used by the outer apply solution:

SELECT t.*, Tbl.Col.value('PN[1]', 'nvarchar(32)') as PPN
FROM dbo.Test t
OUTER APPLY t.xmldata.nodes('PI') as Tbl(Col);

In that case, the cpu time is at 3.5seconds:

CPU time = 3594 ms,  elapsed time = 10979 ms.

on my instance.

Randi Vertongen
  • 16,593
  • 4
  • 36
  • 64