Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi team,
I am trying to do something tricky and could use some help. I have a number of dates for which I also have a number of transactions. Assume that, as a first step, we do a sum of these transactions per day using a group by but still the group by wont only be on date so we might have a various number of rows per each date.
What I want to do is calculate for each row the average of these sums for that date and the 6 previous ones and save it as a new field. It is worth noting that I dont have all calendar dates available and as I said could be more than one rows per date even after group by so i am not sure how helpful functions like previous () etc can be unless the date can be checked specifically..
Any ideas? This has to be done in the script..
Thanks
Just provided sample data\
DATE | Employee | Amount 1 |
10/1/2015 | Adam | 5 |
10/1/2015 | George | 5 |
10/1/2015 | George | 10 |
10/1/2015 | George | 5 |
10/1/2015 | George | 5 |
09/30/2015 | Adam | 5 |
09/30/2015 | Adam | 5 |
09/30/2015 | Adam | 10 |
09/30/2015 | George | 5 |
09/28/2015 | George | 10 |
09/28/2015 | George | 5 |
09/28/2015 | George | 10 |
09/23/2015 | Adam | 5 |
09/23/2015 | George | 10 |
09/23/2015 | George | 10 |
09/23/2015 | George | 10 |
Now imagine the first row like this instead
10/1/2015 Adam (average of Amount 1 for Adam for dates 10/1, 9/30, 9/29, 9/28, 9/27, 9/26) = 25/4
Thats even though not all dates exist
Ah the average is also by employee, so:
tmp1_dailytransactions:
load
transac_date
, employee_id
, sum(amount) as total_amount
resident transaction_table
group by
transac_date
, employee_id ;
tmp2_avgtransactions:
load
transac_date
, employee_id
, date(transac_date+6) as transac_window_date
, amount
resident transaction_table ;
inner join (tmp1_dailytransactions)
interval match (transac_date, employee_id)
load distinct
transac_date
, transac_window_date
, employee_id
resident tmp2_avgtransactions ;
// Sorry there was a mistake here in the previous code, amount can't be in the interval match join
inner join (tmp1_dailytransactions)
load
transac_date
, transac_window_date
, employee_id
, amount
resident tmp2_avgtransactions ;
dailytransactions:
load
transac_date
, employee_id
, total_amount
, avg(amount) as avg_amount
resident tmp1_dailytransactions
group by
transac_date
, employee_id
, total_amount ;
drop tables tmp1_dailytransactions, tmp2_avgtransactions ;
Although I can follow the idea it doesn't seem to work. In my model I start with one table only (so transaction_table is no longer needed) which you can get by copy pasting this. So you can assume we start from your tmp1_dailytransactions that is.
Data:
LOAD date(BUSDATE) as BUSDATE, Employee, sum(Amount) as TAmount
group by date(BUSDATE), Employee;
LOAD * INLINE [
BUSDATE, Employee, Amount
1/10/2015, Adam, 5
1/10/2015, George, 5
1/10/2015, George, 10
1/10/2015, George, 5
1/10/2015, George, 5
30/09/2015, Adam, 5
30/09/2015, Adam, 5
30/09/2015, Adam, 10
30/09/2015, George, 5
28/09/2015, George, 10
28/09/2015, George, 5
28/09/2015, George, 10
23/09/2015, Adam, 5
23/09/2015, George, 10
23/09/2015, George, 10
23/09/2015, George, 10
];
Then I am trying to create the inner join and the interval match unsuccessfully..
Worth mentioning that we need to go 6 days BACK so date(transac_date - 6) as transac_window_date
Hi Joseph,
Use two variables and assign the selected date in one variable and assign the (selected date -6) in 2nd variable.
and in the expression using set analysis you can calculate the average.
BR,
SK
No transaction_table is useful because it is used twice. Create it with your inline table.
Then, if you want to go 6 days back, it's +6 and not -6. For example with the lines "23/09/2015, George, 10", with the +6, transac_windows_date will be the 29th. So then, when you will join with the total amount of the 29th, the line from the 23rd will be picked up. I know it's not natural but when you want to associated past data with future data it's easier to move forward the date of the past.
Ok so it runs now but not giving the desired results, by mirroring the above I get
transac_date, employee_id , total_amount, avg_amount
1/10/2015 Adam 5 6.66667
30/09/2015 George 5 6.66667
so dates & employees combos missing and the total and average amount is not right either. What i am after is one row per date&employee giving the total amount (thats the tmp1_dailytransactions table basically) and then the avg amount which for 1/10/2015 and Adam is all Adam's amount for all days till 25/09, so that's it
(5 + 5 + 5 + 10) / 4 = 6.25
Hi Santhosh,
This needs to be done in the script unfortunately
Sorry I have no QlikView to test (no version for Android yet ).
I think that the Inner Join are a mistake, replace all of them by left join.
Are you following this code by eye? Yes I had changed that myself which fixed the missing rows thing but averages still wrong:
BUSDATE | Employee | TAmount | avg_amount |
---|---|---|---|
1/10/2015 | Adam | 5 | 6.6666667 |
1/10/2015 | George | 25 | |
30/09/2015 | Adam | 20 | |
30/09/2015 | George | 5 | 6.6666667 |
28/09/2015 | George | 25 | |
23/09/2015 | Adam | 5 | |
23/09/2015 | George | 30 |
I have attached the whole script for you to check when you get the chance