Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 
Not applicable

Calculating rolling 7 days averages in script

Hi team,

I am trying to do something tricky and could use some help. I have a number of dates for which I also have a number of transactions. Assume that, as a first step, we do a sum of these transactions per day using a group by but still the group by wont only be on date so we might have a various number of rows per each date.

What I want to do is calculate for each row the average of these sums for that date and the 6 previous ones and save it as a new field. It is worth noting that I dont have all calendar dates available and as I said could be more than one rows per date even after group by so i am not sure how helpful functions like previous () etc can be unless the date can be checked specifically..

Any ideas? This has to be done in the script..

Thanks

1 Solution

Accepted Solutions
vincent_ardiet
Specialist
Specialist

Hi Joseph,

Ok, so, there were 2 mistakes, first my intervalmatch should not use existing field names, so I have renamed in tmp2 the two dates. Then, it's the average of the daily transations and not the average of each transaction, so tmp2 is not more based on transaction_table but on tmp1.

I have also renamed the amount fields to have something easier to understand.

And yes the first code was written without Qlikview, else I will not have wrote "interval match" instead of "intervalmatch" .

Here is the code:


transaction_table:
LOAD * INLINE [
transac_date,employee_id,amount
1/10/2015,Adam,5
1/10/2015,George,5
1/10/2015,George,10
1/10/2015,George,5
1/10/2015,George,5
30/09/2015,Adam,5
30/09/2015,Adam,5
30/09/2015,Adam,10
30/09/2015,George,5
28/09/2015,George,10
28/09/2015,George,5
28/09/2015,George,10
23/09/2015,Adam,5
23/09/2015,George,10
23/09/2015,George,10
23/09/2015,George,10
];


tmp1_dailytransactions:
load
  transac_date
, employee_id
, sum(amount) as daily_amount
resident transaction_table
group by
  transac_date
, employee_id ;

tmp2_avgtransactions:
load
  transac_date as begin_window_date
, employee_id
, date(transac_date+6) as end_window_date
, daily_amount as daily_amount_window
resident tmp1_dailytransactions ;

left join (tmp1_dailytransactions)
intervalmatch (transac_date, employee_id)
load distinct
  begin_window_date
, end_window_date
, employee_id
resident tmp2_avgtransactions ;

left join (tmp1_dailytransactions)
load
  begin_window_date
, end_window_date
, employee_id
, daily_amount_window
resident tmp2_avgtransactions ;


dailytransactions:
load
  transac_date
, employee_id 
, daily_amount
, avg(daily_amount_window) as avg_daily_amount
resident tmp1_dailytransactions
group by
  transac_date
, employee_id 
, daily_amount ;

drop tables tmp1_dailytransactions, tmp2_avgtransactions ;


Regards,

Vincent

View solution in original post

22 Replies
vincent_ardiet
Specialist
Specialist

Hi Joseph,

If you have a SQL database in source, like Oracle or SQLServer you can use a partition by windowing function.

Else, if you can't, maybe you can create a second transaction date in your table which is 7 days after the real one. Then, with a interval match joining the table with itself you will be able to group all the transactions.

I hope it's clear :-).

Regards,

Vincent

datanibbler
Champion
Champion

Hi Joseph,

I understand you already have some chart displaying the sum of whatever per day?

There are functions in QlikView - BEFORE and AFTER I think, haven't yet found a use for them, but they ARE useful - which allow you to access the different "points" along your dimension which is the day.

I think that would be a good starting_point for a rolling average in a chart.

Sorry I cannot help you further, as I said I haven't yet done it.

Best regards,

DataNibbler

sujeetsingh
Master III
Master III

Use previous function in Qlikview or i will sugest is to have all dates sorted and use rowno() and assign them a rank then just take only Rank<=6 as average

Not applicable
Author

Hi Vincent,

That sounds promising.. Could you provide a short example if not too much to ask

Not applicable
Author

Hi,

Sorry just changed the description above, I can still have a various number of rows per date even after the group by, lets say by employee id

Anonymous
Not applicable
Author

Can you provide sample app or data.

Thanks ,

Bunny

Anonymous
Not applicable
Author

Rolling can be done by ,

Range sum

or Above function which helps in rolling previous days results

or by having aggr with range sum function.

thanks,

bunny

vincent_ardiet
Specialist
Specialist

I can't test what I'm writing so you may have some adjustments to do, so for example:

tmp1_dailytransactions:

load

  transac_date

, employee_id

, sum(amount) as total_amount

resident transaction_table

group by

  transac_date

, employee_id ;

tmp2_avgtransactions:

load

  transac_date

, date(transac_date+6) as transac_window_date

, amount

resident transaction_table ;

inner join (tmp1_dailytransactions)

interval match (transac_date)

load

  transac_date

, transac_window_date

, amount

resident tmp2_avgtransactions ;

dailytransactions:

load

  transac_date

, employee_id 

, total_amount

, avg(amount) as avg_amount

resident tmp1_dailytransactions

group by

  transac_date

, employee_id 

, total_amount ;

drop tables tmp1_dailytransactions, tmp2_avgtransactions ;


Not applicable
Author

Something like this

    

DATEEmployeeAmount 1
10/1/2015Adam5
10/1/2015George5
10/1/2015George10
10/1/2015George5
10/1/2015George5
09/30/2015Adam5
09/30/2015Adam5
09/30/2015Adam10
09/30/2015George5
09/28/2015George10
09/28/2015George5
09/28/2015George10
09/23/2015Adam5
09/23/2015George10
09/23/2015George10
09/23/2015George10

Now imagine the first row like this instead
10/1/2015   Adam   (average of Amount 1 for Adam for dates 10/1, 9/30, 9/29, 9/28, 9/27, 9/26) = 25/4

Thats even though not all dates exist