2

Query 1 and Query 2 share the same semantics, both involving the combination of the NOT and BETWEEN operators, with the evaluation of the BETWEEN expression being FALSE in both cases. However, Query 1 unexpectedly retrieves more null values than Query 2. Could this be a logical error?

DROP TABLE tsqsdb0_t0;

CREATE TABLE tsqsdb0_t0( time TIMESTAMP PRIMARY KEY, c0 INTEGER ) USING TIMESERIES ;

INSERT OR REPLACE INTO tsqsdb0_t0(time, c0) VALUES (TIMESTAMP('2022-01-01T16:00:00Z'), 0), (TIMESTAMP('2022-01-01T16:00:05Z'), null);


Query 1

SELECT c0, 
       time 
FROM tsqsdb0_t0 
WHERE NOT ( 2 BETWEEN c0 AND 1);

Query 2

SELECT c0, 
       time 
FROM tsqsdb0_t0 
WHERE NOT ( -1 BETWEEN c0 AND 1);
Martin Smith
  • 87,941
  • 15
  • 255
  • 354
Alice
  • 163
  • 2

3 Answers3

3

Could this be a logical error?

No - it is entirely logically correct and is a consequence of the fact that BETWEEN combines two predicates.

test_expression BETWEEN begin_expression AND end_expression  

is equivalent to

test_expression  >= begin_expression AND test_expression  <= end_expression

So for Query 1

2 BETWEEN c0 AND 1

is equivalent to

2  >= c0  AND 2  <= 1

This is equivalent to

2  >= c0  AND False

If an AND-ed boolean expression contains a False then the whole expression simplifies to False in three valued logic too so this is equivalent to

False

And so negating that is

True

For Query 2 we end up with

-1  >= c0  AND -1  <= 1

This is equivalent to

-1  >= c0  AND True

Which simplifies to just

-1  >= c0

So this expression evaluates to

  • true when c0 <= -1
  • false when c0 > -1
  • unknown when c0 is null

Negating unknown still yields unknown which is why the null is not returned by query 2.

Martin Smith
  • 87,941
  • 15
  • 255
  • 354
2

I wouldn't call it a logical error; most likely it's a side effect of query optimisation.

2 BETWEEN c0 AND 1 is always false (nothing can be simultaneously greater than or equal 2 and less than or equal 1), therefore NOT (2 BETWEEN c0 AND 1) is always true, a smart optimiser would figure that out and wouldn't even bother evaluating the search criteria, simply returning all rows in the table.

-1 BETWEEN c0 AND 1, on the other hand, could be true, so the criterion will be evaluated, excluding the NULL value, for which the expression -1 BETWEEN c0 AND 1 evaluates to NULL and therefore NOT (-1 BETWEEN c0 AND 1) is not true.

Unexpected behaviour like this is one of the reasons some people hate NULLs.

mustaccio
  • 28,207
  • 24
  • 60
  • 76
2

General form

Your predicates are of the form:

NOT (x BETWEEN c0 AND 1)

Where x is either 2 or -1 in the example queries.

Expanding BETWEEN and simplifying using De Morgan's laws:

NOT (c0 <= x AND x <= 1)

-- De Morgan: not (A and B) = (not A) or (not B) = NOT (c0 <= x) OR NOT (x <= 1)

= c0 > x OR x > 1

Query 1

When x is 2:

c0 > 2 OR 2 > 1
= c0 > 2 OR true
-- anything OR true = true
= true

Query 1 returns all rows (even where c0 is null) because the predicate is always true.

Query 2

When x is -1:

c0 > -1 OR -1 > 1
= c0 > -1 OR false
= c0 > -1

Query 2 returns rows where c0 > -1.
It excludes rows where c0 is null because the predicate evaluates to unknown.
Rows are only returned if the predicate returns true.

Paul White
  • 94,921
  • 30
  • 437
  • 687