0

Based on AdventureWorks2022, I'm trying to use CONTAINS alongside an OR operator:

DECLARE @textToSearch NVARCHAR(500) = '';

SELECT Description FROM SalesLT.ProductDescription WHERE /Condition1/ LEN(COALESCE(@textToSearch, '')) > 0 AND /Condition2/ CONTAINS(Description, @textToSearch);

It is expected the Condition2 not to calculated if Condition1 is false, but strangly I got this error:

Null or empty full-text predicate.

This means CONTAINS ran with an empty parameter! And the strange thing is that if you changed the Condition1 to 1=0 (that obviously results in false), everything is OK and CONTAINS never will run.

The questions are:

  1. Why the operator precedece not works?!
  2. Is it possible to apply CONTAINS if and only if @textToSearch has a non-empty value? How?

NOTE: I don't want to split the query into two queries

MSDN references:

Using CONTAINS with a logical operator (AND)

Operator Precedence (Transact-SQL)

ABS
  • 157
  • 2
  • 14

3 Answers3

2

SQL is a declarative language. You describe what you want, not how you want. Thus, boolean short circuit might or might not be available. The query optimizer is allowed great leniency how to work out the how part. This includes changing the evaluation order of logical operators, and is documented in the remarks section of search conditions section in the docs.

TL;DR:

The order of evaluation of logical operators can vary depending on choices made by the query optimizer.

SQL Server isn't unique with this respect, PostgreSQL behaves the same way:

The order of evaluation of subexpressions is not defined. In particular, the inputs of an operator or function are not necessarily evaluated left-to-right or in any other fixed order. Furthermore, if the result of an expression can be determined by evaluating only some parts of it, then other subexpressions might not be evaluated at all. For instance, if one wrote:

SELECT true OR somefunc();

then somefunc() would (probably) not be called at all.

Also, see earlier question about the same.

vonPryz
  • 3,169
  • 19
  • 19
1

1=0 is a constant expression. Since, it is FALSE the expression after the AND does not matter and no records will be returned ever. SQL-Server might decide to not execute the SELECT command at all and simply return an empty recordset.

On the other side LEN(COALESCE(@textToSearch, '')) > 0 is not a constant expression and must therefore be evaluated and along with it possibly the expression after the AND.

  1. As @vonPryz explains, SQL-Server might use short circuit evaluation or it might not and the order of evauation is undefined.

  2. NO. Therefore, use a CASE expression. The linked documentation says:

    The CASE expression evaluates its conditions sequentially and stops with the first condition whose condition is satisfied. [...] You should only depend on order of evaluation of the WHEN conditions for scalar expressions (including non-correlated subqueries that return scalars), not for aggregate expressions.

    SELECT Description
    FROM SalesLT.ProductDescription
    WHERE
        CASE WHEN LEN(COALESCE(@textToSearch, '')) = 0 THEN 0
             WHEN CONTAINS(Description, @textToSearch) THEN 1
             ELSE 0
        END = 1
    
0

While @vonPryz' answer is correct insofar as the sequencing (i.e. the evaluation-order) of predicate terms in a WHERE clause is not within the query author's control, it is still possible to use other T-SQL language features to ensure safety and correctness.

But I do note that your particular issue only happens with a statically defined T-SQL variable - which means it's your responsibility to add an IF guard to your entire SELECT query:

Or rather: Don't execute a query that will fail when your @variables are invalid, like so:

DECLARE @textToSearch NVARCHAR(500) = N'';

IF( @textToSearch IS NOT NULL AND LEN( @textToSearch ) > 0 ) BEGIN;

SELECT
    pd."Description"
FROM
    SalesLT.ProductDescription AS pd
WHERE
    LEN( COALESCE( @textToSearch, N'' ) ) > 0 /*Condition1*/
    AND
    CONTAINS( pd."Description", @textToSearch ); /*Condition2*/ 

END; /* Optionally, you may want to execute a query that always returns zero rows but with the same column-schema (names, datatypes, nullability, etc) to ensure any consumers won't break due to assumptions about resultset existence or ordering: ...like so: */ ELSE BEGIN

SELECT
    N'' AS "Description"
FROM
    ( VALUES (1) ) AS dummy( a )
WHERE
    dummy.a = 0;

END;


But if your arguments to CONTAINS are not variables, but just column-expressions inside a SELECT, then the solutions (or workarounds) are different.

Supposing you have this as your current query, which will fail similarly to your current query when otherTable.KnownSubstring is not NULL but still empty because the CONTAINS function requires nonempty arguments:

SELECT
    pd.Description
FROM
    SalesLT.ProductDescription AS pd
    INNER JOIN otherTable AS ot ON pd.Foo = ot.Bar
WHERE
    LEN( COALESCE( ot.KnownSubstring, N'' ) ) > 0 /*Condition1*/
    AND
    CONTAINS( pd.Description, ot.KnownSubstring ); /*Condition2*/ 

...in which case use a derived table (inner subquery) or CTE to initially filter out invalid rows:

WITH validJoinedRows AS (
SELECT
    pd.*
    ot.*
FROM
    SalesLT.ProductDescription AS pd
    INNER JOIN otherTable AS ot ON pd.Foo = ot.Bar
WHERE
    ot.KnownSubstring IS NOT NULL
    AND
    LEN( ot.KnownSubstring ) > 0
    AND
    pd.Description IS NOT NULL

) SELECT vj."Description" FROM validJoinedRows AS vj WHERE CONTAINS( vj.Description, vj.KnownSubstring );

Dai
  • 632
  • 4
  • 20