SQL Data aggregation

Question

I have a problem with sql data aggregation problem.here is the sample sql.I am using oracle database.

select pm.city cityid, to_char(pm.sampledate, 'yyyy-MM') timepoint, count(distinct   me.id) mecount, count(distinct me.hotpointid) hpcount
from upme me, appmday pm
where pm.apid = me.id
          and pm.sampledate between to_date('2011-11-01', 'yyyy-MM-dd') and
          to_date('2012-1-31', 'yyyy-MM-dd')
group by pm.city, to_char(pm.sampledate, 'yyyy-MM')

above i get the count of my data.but i also i want to get the overall result like

select   to_char(pm.sampledate, 'yyyy-MM'), count(distinct me.id), count(distinct me.hotpointid)
from upme me, appmday pm
        where pm.apid = me.id
              and pm.sampledate between to_date('2011-11-01', 'yyyy-MM-dd') and
              to_date('2012-1-31', 'yyyy-MM-dd')
              group by to_char(pm.sampledate, 'yyyy-MM')

basically i want get some result from sql ,in the same sql i want to get the count from the previous result,they both show in the result in one sql. i did not want to union these two sql ,because each sql is very slow.

did someone have a idea? or it cannot be done by one sql without union?

thank you for the reply.

score 4 · Accepted Answer · answered Feb 20 '12 at 12:31

You can use analytic functions to achieve this:

select distinct pm.city cityid,
    to_char(trunc(pm.sampledate, 'MM'), 'YYYY-MM') as timepoint,
    count(distinct me.id)
        over (partition by pm.city, trunc(pm.sampledate, 'MM')) as mecount,
    count(distinct me.hotpointid)
        over (partition by pm.city, trunc(pm.sampledate, 'MM')) as hpcount,
    count(distinct me.id)
        over (partition by trunc(pm.sampledate, 'MM')) as totmecount,
    count(distinct me.hotpointid)
        over (partition by trunc(sampledate, 'MM')) as tothpcount
from upme me
join appmday pm on pm.apid = me.id
where pm.sampledate between to_date('2011-11-01', 'yyyy-MM-dd')
    and to_date('2012-1-31', 'yyyy-MM-dd');

Or instead of a distinct, use a subquery and a row_number() calculation to remove the duplicates:

select cityid, to_char(timepoint, 'YYYY-MM') as timepoint,
    mecount, hpcount, totmecount, tothpcount
from (
    select pm.city cityid,
        row_number() over (partition by pm.city, trunc(pm.sampledate, 'MM')
            order by pm.city, trunc(pm.sampledate, 'MM')) as rn,
        trunc(pm.sampledate, 'MM') as timepoint,
        count(distinct me.id)
            over (partition by pm.city, trunc(pm.sampledate, 'MM')) as mecount,
        count(distinct me.hotpointid)
            over (partition by pm.city, trunc(pm.sampledate, 'MM')) as hpcount,
        count(distinct me.id)
            over (partition by trunc(pm.sampledate, 'MM')) as totmecount,
        count(distinct me.hotpointid)
            over (partition by trunc(sampledate, 'MM')) as tothpcount
    from upme me
    join appmday pm on pm.apid = me.id
    where pm.sampledate between to_date('2011-11-01', 'yyyy-MM-dd')
        and to_date('2012-1-31', 'yyyy-MM-dd')
)
where rn = 1;

If your existing query is slow you may see a performance difference between the two options, otherwise it's a matter or preference.

(Also you may be aware of this, but if your sampledate has a time component, having your where clause end on '2012-1-31' means you'll only get data up to the start of that day, and not for example anything at 2012-01-31 01:00.)

score 1 · Answer 2 · answered Feb 20 '12 at 09:07

You would need to use analytic functions to get that result. I'm not completely sure that I understand exactly what results you're looking for, but something like this would appear to be what you're looking for.

select distinct
       pm.city cityid, 
       to_char(pm.sampledate, 'yyyy-MM') timepoint, 
       count(distinct   me.id) 
         over (partition by to_char(pm.sampledate, 'yyyy-mm'), pm.city) mecount_1, 
       count(distinct me.hotpointid) 
         over (partition by to_char(pm.sampledate, 'yyyy-mm'), pm.city) hpcount_1,
       count(distinct   me.id) 
         over (partition by to_char(pm.sampledate, 'yyyy-mm')) mecount_2, 
       count(distinct me.hotpointid) 
         over (partition by to_char(pm.sampledate, 'yyyy-mm')) hpcount_2
  from upme me,
       appmday pm
 where pm.apid = me.id
   and pm.sampledate between to_date('2011-11-01', 'yyyy-MM-dd') and
          to_date('2012-1-31', 'yyyy-MM-dd')

SQL Data aggregation

2 Answers2

Linked