25

I'm trying to migrate a query from Oracle to SQL Server 2014.

Here is my query which works great in Oracle:

select
count(distinct A) over (partition by B) / count(*) over() as A_B
from MyTable 

Here is the error i got after tried to run this query in SQL Server 2014.

Use of DISTINCT is not allowed with the OVER clause

Anyone know what is the problem? Is such as kind of query possible in SQL Server? Please advise.

Omri
  • 383
  • 1
  • 5
  • 17

5 Answers5

23

This gives the distinct count(*) for A partitioned by B:

dense_rank() over (partition by B order by A) 
+ dense_rank() over (partition by B order by A desc) 
- 1
Ben
  • 239
  • 2
  • 2
15

Anyone know what is the problem? Is such as kind of query possible in SQL Server?

No it isn't currently implemented. See the following connect item request.

OVER clause enhancement request - DISTINCT clause for aggregate functions

Another possible variant would be

SELECT M.A,
       M.B,
       T.A_B
FROM   MyTable M
       JOIN (SELECT CAST(COUNT(DISTINCT A) AS NUMERIC(18,8)) / SUM(COUNT(*)) OVER() AS A_B,
                    B
             FROM   MyTable
             GROUP  BY B) T
         ON EXISTS (SELECT M.B INTERSECT SELECT T.B) 

the cast to NUMERIC is there to avoid integer division. The reason for the join clause is explained here.

It can be replaced with ON M.B = T.B OR (M.B IS NULL AND T.B IS NULL) if preferred (or simply ON M.B = T.B if the B column is not nullable).

jmoreno
  • 1,097
  • 1
  • 7
  • 24
Martin Smith
  • 87,941
  • 15
  • 255
  • 354
9

You can take the max value of dense_rank() to get the distinct count of A partitioned by B.

To take care of the case where A can have null values you can use first_value to figure out if a null is present in the partition or not and then subtract 1 if it is as suggested by Martin Smith in the comment.

select (max(T.DenseRankA) over(partition by T.B) - 
          cast(iif(T.FirstA is null, 1, 0) as numeric(18, 8))) / T.TotalCount as A_B
from (
     select dense_rank() over(partition by T.B order by T.A) DenseRankA,
            first_value(T.A) over(partition by T.B order by T.A) as FirstA,
            count(*) over() as TotalCount,
            T.A,
            T.B
     from MyTable as T
     ) as T
Mikael Eriksson
  • 22,295
  • 5
  • 63
  • 106
5

Try doing a subquery, grouping by A, B, and including the count. Then in your outer query, your count(distinct) becomes a regular count, and your count(*) becomes a sum(cnt).

select
count(A) over (partition by B) * 1.0 / 
    sum(cnt) over() as A_B
from
(select A, B, count(*) as cnt
 from MyTable
 group by A, B) as partial;
Rob Farley
  • 16,324
  • 2
  • 39
  • 61
0

SQL Server for now does not allow using Distinct with windowed functions.

But once you remember how windowed functions work (in simplistic terms: they're applied to result set of the query), you can work around that:

select B,
min(count(distinct A)) over (partition by B) / max(count(*)) over() as A_B
from MyTable
group by B
AcePL
  • 349
  • 1
  • 4
  • 15