2

I'm trying to ignore duplicates between two providers in my tables while using join.

I'm using this query for now:

SELECT *
FROM A AS a1, A AS a2
WHERE a1.provider <> a2.provider AND a1.etype = a2.etype

Unfortunately, that query gives me duplicates (two rows of the same providers, in opposite sides).

I'm trying to ignore rows that appear twice in the opposite sides, for example:

etype    provider    etype    provider
hello    aaa         hello    bbb
hello    bbb         hello    aaa

This dbfiddle illustrates the desired output as well.

Paul White
  • 94,921
  • 30
  • 437
  • 687
Omer
  • 31
  • 2

2 Answers2

2

What Kondybas explained about the appropriate operator, but with legal syntax:

SELECT etype, a1.provider AS provider1, a2.provider AS provider2
FROM   A a1
JOIN   A a2 USING (etype)
WHERE  a1.provider < a2.provider;

[INNER] JOIN requires a join condition, with either ON or USING.

To keep events with only a single provider in the result, use a LEFT [OUTER] JOIN:

SELECT a1.etype, a1.provider AS provider1, a2.provider AS provider2
FROM   A a1
LEFT   JOIN A a2 ON a1.etype = a2.etype
                AND a1.provider < a2.provider;

The second condition has to move to the join condition as well in this case. See:

Shog9
  • 101
  • 2
  • 9
Erwin Brandstetter
  • 185,527
  • 28
  • 463
  • 633
1

You have to replace the <> comparison by < or >. Otherwise the plain CROSS JOIN produce the full cartesian product with main diagonal excluded.

Now you get the next:

-- AB AC AD
BA -- BC BD
CA CB -- CD
DA DB DC --

instead of :

-- AB AC AD
-- -- BC BD
-- -- -- CD
-- -- -- --

So your query should look like that:

SELECT *
  FROM       A AS a1
  CROSS JOIN A AS a2  -- my strong advise to use JOIN instead of ',' for readability
 WHERE a1.provider < a2.provider -- here is the comparison I've mean
   AND a1.etype    = a2.etype
;

N.B.
Proper code formatting can improve the comprehension tremendously.

Kondybas
  • 4,800
  • 19
  • 16