18

Is a postgres way of combining IS DISTINCT FROM with ANY or some other neat way of getting the same result?

select count(*)
from (select 'A' foo union all select 'Z' union all select null) z
where foo <> any(array[null, 'A']);

 count
-------
     1
(1 row)

select count(*)
from (select 'A' foo union all select 'Z' union all select null) z
where foo is distinct from any(array[null, 'A']);  

ERROR:  syntax error at or near "any"
LINE 3: where foo is distinct from any(array[null, 'A']);
                                   ^
Erwin Brandstetter
  • 185,527
  • 28
  • 463
  • 633
Jack Douglas
  • 40,517
  • 16
  • 106
  • 178

3 Answers3

21

Looking at it as a grammar problem, ANY is defined as (in Row and Array Comparisons):

expression operator ANY (array expression)

But is distinct from is not an operator, it's a "construct" as we're told in Comparison Operators:

When this behavior is not suitable, use the IS [ NOT ] DISTINCT FROM constructs

Since PostgreSQL has user-defined operators, we may define an operator/function combo for this purpose:

create function is_distinct_from(text, text) returns bool as 
'select $1 is distinct from $2;' language sql;

create operator <!> (
 procedure=is_distinct_from(text,text),
 leftarg=text, rightarg=text
);

Then it can precede ANY:

select count(*)
from (select 'A' foo union all select 'Z' union all select null) z
where foo <!> any(array[null, 'A']);  
 count 
-------
     3
(1 row)
Daniel Vérité
  • 32,662
  • 3
  • 78
  • 84
18

Operator

This is building on @Daniel's clever operator.
While being at it, create the function/operator combo using polymorphic types. Then it works for any type - just like the construct.
And make the function IMMUTABLE.
And PARALLEL SAFE (Postgres 9.6+).
Using current syntax for CREATE OPERATOR (Postgres 12+).

CREATE FUNCTION is_distinct_from(anyelement, anyelement)
  RETURNS bool
  LANGUAGE sql IMMUTABLE PARALLEL SAFE COST 5 AS
'SELECT $1 IS DISTINCT FROM $2';

CREATE OPERATOR &lt;!&gt; ( FUNCTION = is_distinct_from(anyelement, anyelement) , LEFTARG = anyelement , RIGHTARG = anyelement );

The operator <!> does not seem to be in use in any Postgres module.

If you are going to use this operator a lot, you might flesh it out some more to assist the query planner. For starters, add the COMMUTATOR and NEGATOR clauses to assist the query optimizer. Replace CREATE OPERATOR from above with this:

CREATE OPERATOR <!> (
  FUNCTION = is_distinct_from(anyelement, anyelement)
, LEFTARG  = anyelement
, RIGHTARG = anyelement
, COMMUTATOR = <!>
, NEGATOR = =!=
);

And add:

CREATE FUNCTION is_not_distinct_from(anyelement, anyelement)
  RETURNS bool
  LANGUAGE sql IMMUTABLE PARALLEL SAFE COST 5 AS
'SELECT $1 IS NOT DISTINCT FROM $2';

CREATE OPERATOR =!= ( FUNCTION = is_not_distinct_from(anyelement, anyelement) , LEFTARG = anyelement , RIGHTARG = anyelement , COMMUTATOR = =!= , NEGATOR = <!> );

But the additional clauses won't help with the case at hand and plain indexes still won't be used. It's more sophisticated to achieve that. (I haven't tried.) Read the chapter "Operator Optimization Information" in the manual for details.

Test case

The test case in the question can only succeed if all values in the array are identical. For the array in the question ('{null,A}'::text[]) the result is always true. Is that intended? I added another test for "IS DISTINCT FROM ALL":

SELECT foo
     , foo <!> ANY ('{null,A}'::text[]) AS chk_any
     , foo <!> ALL ('{null,A}'::text[]) AS chk_all
FROM (
   VALUES ('A'),('Z'),(NULL)
   ) z(foo);

foo | chk_any | chk_all -----+---------+--------- A | t | f Z | t | t | t | f

Alternative with standard operators

foo IS DISTINCT FROM ANY (test_arr) (illegal syntax) can almost be translated to:

foo = ALL (test_arr) IS NOT TRUE

foo = ALL (test_arr) yields ...

... true if all elements are foo
... false if any NOT NULL element is <> foo
... null if at least one element IS NULL and no element is <> foo

So, the remaining corner case is where ...

  • foo IS NULL
  • and test_arr consists of nothing but NULL elements.

If either one can be ruled out, we are done. Therefore, use the simple test if ...

  • the column is defined NOT NULL.
  • or you know the array is never all NULLs.

Else, test additionally:

AND ('A' = ALL(test_arr) IS NOT NULL OR 
     'B' = ALL(test_arr) IS NOT NULL OR
     foo IS NOT NULL)

Where 'A' and 'B' can be any distinct values. Explanation and alternatives, see:

Again, if you know about any value that cannot exist in test_arr, for instance the empty string '', you can still simplify:

AND ('' = ALL(test_arr) IS NOT NULL OR foo IS NOT NULL)

Result matrix for all combinations:

SELECT foo, test_arr
     , foo = ALL(test_arr) IS NOT TRUE  AS test_simple
     , foo = ALL(test_arr) IS NOT TRUE
       AND ('A' = ALL(test_arr) IS NOT NULL OR
            'B' = ALL(test_arr) IS NOT NULL OR 
            foo IS NOT NULL)            AS test_sure 
FROM (
   VALUES ('A'),('Z'),(NULL)
   ) v(foo)
CROSS JOIN (
   VALUES ('{null,A}'::text[]),('{A,A}'),('{null,null}')
   ) t(test_arr)

foo | test_arr | test_simple | test_sure -----+-------------+-------------+----------- A | {NULL,A} | t | t A | {A,A} | f | f -- only TRUE case A | {NULL,NULL} | t | t Z | {NULL,A} | t | t Z | {A,A} | t | t Z | {NULL,NULL} | t | t | {NULL,A} | t | t | {A,A} | t | t | {NULL,NULL} | t | f -- special case

This is a bit more verbose than Andriy's EXCEPT solution, but it is substantially faster.

Erwin Brandstetter
  • 185,527
  • 28
  • 463
  • 633
8

Perhaps like this:

select foo
     , exists (values (null), ('A') except select foo) chk_any
     , not exists (values (null), ('A') intersect select foo) chk_all
from ( values ('A'),('Z'),(null) ) z(foo);

 foo | chk_any | chk_all
-----+---------+---------
 A   | t       | f
 Z   | t       | t
     | t       | f

Note that not only the null in the "array" but also the null in z is being compared this way.

Jack Douglas
  • 40,517
  • 16
  • 106
  • 178
Andriy M
  • 23,261
  • 6
  • 60
  • 103