3

Analyzing a simple query, I noticed that the REPLACE function increases the estimated row size. Look at the following query, executed on AdventureWorks:

select p.BusinessEntityID, REPLACE(p.FirstName, 'a', 'b') as X
from Person.Person p

The following is the execution plan. The estimated row size, starting from 65 B, goes up to 4015 B only applying the Compute Scalar component that is related to the REPLACE function. Can anyone give an explanation about it?

The test has been done on Sql Server 2022.

Thanks in advance.

enter image description here

1 Answers1

7
SELECT TOP (1) 
    BaseType = SQL_VARIANT_PROPERTY(REPLACE(P.FirstName, 'a', 'b'), 'BaseType'),
    MaxLen = SQL_VARIANT_PROPERTY(REPLACE(P.FirstName, 'a', 'b'), 'MaxLength')
FROM Person.Person AS P;
BaseType MaxLen
nvarchar 8000

You can in general replace shorter substrings with longer ones. SQL Server appears not to bother trying to determine the exact maximum length of the result given the length of the substrings provided, the number of expected matches, and any difference in byte count even if the number of characters is the same.

For a non-max input, the return type is limited to 8000 bytes as documented. As usual, the estimate for a variable-length column is 50% of the maximum length.

You can of course explicitly CAST or CONVERT the REPLACE result if this causes problems for you. As a side note, it is generally best to match data types so the literals in your example should have an N prefix as FirstName is nvarchar(50).

SELECT CONVERT(nvarchar(50), REPLACE(P.FirstName, N'a', N'b'))
FROM Person.Person AS P;

fixed

Paul White
  • 94,921
  • 30
  • 437
  • 687