I have a table that stores the information about user calls in a call center. The table has a call_id, date when the call was made, actual date and time of the call, call type and a score associated with the call.
My requirement is to calculate a 40 day moving average of the score with respect to the call day. The 40 day should start from the previous day from the call date. If there are no call in the past 40 days it should include rows for the call date for which the moving average is being calculated.
Below is sample data:
select * from test_aes;
Output:
call_id | call_dt_key | call_type_id | call_dt_tm | aes_raw
1 | 2016-01-01 | CT1 | 2016-01-01 00:00:10-08 | 10
2 | 2016-01-01 | CT1 | 2016-01-01 00:00:20-08 | 20
3 | 2016-01-01 | CT1 | 2016-01-01 00:00:30-08 | 10
4 | 2016-01-01 | CT1 | 2016-01-01 00:00:40-08 | 20
5 | 2016-01-01 | CT1 | 2016-01-01 00:00:50-08 | 10
6 | 2016-01-01 | CT1 | 2016-01-01 00:01:00-08 | 20
7 | 2016-01-01 | CT1 | 2016-01-01 00:02:00-08 | 10
8 | 2016-01-01 | CT1 | 2016-01-01 00:03:00-08 | 20
9 | 2016-01-01 | CT1 | 2016-01-01 00:04:00-08 | 10
10 | 2016-01-01 | CT1 | 2016-01-01 00:05:00-08 | 20
11 | 2016-01-05 | CT1 | 2016-01-05 00:00:10-08 | 10
12 | 2016-01-05 | CT1 | 2016-01-05 00:00:20-08 | 10
13 | 2016-01-05 | CT1 | 2016-01-05 00:00:30-08 | 20
14 | 2016-01-05 | CT1 | 2016-01-05 00:00:40-08 | 20
15 | 2016-01-05 | CT1 | 2016-01-05 00:00:50-08 | 20
16 | 2016-01-10 | CT1 | 2016-01-10 00:00:10-08 | 10
17 | 2016-01-10 | CT1 | 2016-01-10 00:00:20-08 | 20
18 | 2016-01-15 | CT1 | 2016-01-15 00:00:10-08 | 10
19 | 2016-01-15 | CT1 | 2016-01-15 00:00:20-08 | 20
20 | 2016-01-15 | CT1 | 2016-01-15 00:00:30-08 | 20
21 | 2016-01-16 | CT1 | 2016-01-16 00:00:10-08 | 20
22 | 2016-01-16 | CT1 | 2016-01-16 00:00:20-08 | 10
23 | 2016-01-16 | CT1 | 2016-01-16 00:00:30-08 | 20
24 | 2016-01-20 | CT1 | 2016-01-20 00:00:10-08 | 20
25 | 2016-01-20 | CT1 | 2016-01-20 00:00:20-08 | 10
26 | 2016-01-21 | CT1 | 2016-01-21 00:00:10-08 | 10
27 | 2016-01-21 | CT1 | 2016-01-21 00:00:20-08 | 20
28 | 2016-01-31 | CT1 | 2016-01-31 00:00:10-08 | 10
29 | 2016-01-31 | CT1 | 2016-01-31 00:00:20-08 | 20
30 | 2016-02-01 | CT1 | 2016-02-01 00:00:10-08 | 10
31 | 2016-02-01 | CT1 | 2016-02-01 00:00:20-08 | 20
32 | 2016-02-10 | CT1 | 2016-02-10 00:00:10-08 | 10
33 | 2016-02-10 | CT1 | 2016-02-10 00:00:20-08 | 20
34 | 2016-02-15 | CT1 | 2016-02-15 00:00:15-08 | 10
35 | 2016-02-15 | CT1 | 2016-02-15 00:00:20-08 | 20
36 | 2016-02-26 | CT1 | 2016-02-26 00:00:15-08 | 10
37 | 2016-02-26 | CT1 | 2016-02-26 00:00:20-08 | 20
38 | 2016-03-04 | CT1 | 2016-03-04 00:00:15-08 | 10
39 | 2016-03-04 | CT1 | 2016-03-04 00:00:20-08 | 20
40 | 2016-03-18 | CT1 | 2016-03-18 00:00:15-07 | 10
41 | 2016-03-18 | CT1 | 2016-03-18 00:00:20-07 | 20
Thus the output should be:
call_dt_key | average_40
2016-01-01 | 15.0000 (include rows for 2016-01-01)
2016-01-05 | 15.0000 (don't include rows for 2016-01-05)
2016-01-10 | 15.3333 (don't include rows for 2016-01-10)
2016-01-15 | 15.2941 (don't include rows for 2016-01-15)
2016-01-16 | 15.5000 (don't include rows for 2016-01-16)
2016-01-20 | 15.6522 (don't include rows for 2016-01-20)
2016-01-21 | 15.6000 (don't include rows for 2016-01-21)
2016-01-31 | 15.5556 (don't include rows for 2016-01-31)
2016-02-01 | 15.5172 (don't include rows for 2016-02-01)
2016-02-10 | 15.4839 (start date 2015-12-31 end date 2016-02-09)
2016-02-15 | 15.6522 (start date 2016-01-05 end date 2016-02-14)
2016-02-26 | 15.3333 (start date 2016-01-16 end date 2016-02-25)
2016-03-04 | 15.0000 (start date 2016-01-23 end date 2016-03-03)
2016-03-18 | 15.0000 (start date 2016-02-06 end date 2016-03-17)
Schema and test data at below link: SQL Fiddle
I cannot use ROWS in an AVG window definition because test_aes has thousands of rows for a given day.