Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi all,
I have a dataset that has names, dates, and integer values.
For each name, I need an expression that returns the date at which the cumulative sum of the integer value equals *or* exceeds 3.
For example:
Bob,2022-05-12,1
Bob,2022-05-05,0
Bob,2022-04-26,3
Bob,2022-04-25,1
I need to return "2022-04-26" in this example, since the cumulative sum of the value looking back to 4/26 is 4.
FirstSortedValue doesn't seem to quite do the trick, since it would return 2022-04-25 with a rank of 3.
What's the best method to do something like this?
Thanks!
So I came up with a working solution.
First, I accumulated the values in a new field in the script. I'm not going to go into great detail here since there are other answers on this topic, but you need to load the table in, then re-load it in a resident load to be able to Order By the dimension(s) in questions plus date desc, then use Previous to see whether the dimensions match. If they don't, you re-start the accumulation field at whatever its value is on the current record, otherwise you use Peek to find the previous value and add the current value to it.
Once you have the accumulation field created, you can load the date in question like so:
Join (WhateverTable)
Load Dimension,
Min(Date) as DateValueExceeded
Where Value >= 3 //or whatever value you need the date for
Group By Dimension;
Hope this helps someone in the future, including me the next time I try to do this. 😉
can you post a larger sample data that best represents your actual data set with a few more scenarios with expected output
I can't share actual data of course... It would also in some ways obscure the problem since there are a lot more 0s for the integer value.
It's not really that complex - Basically, I need to look back from the present. As soon as the cumulative value hits 3 for that particular person, I need that date. A few examples:
Bob,2022-06-07,3
Bob,2022-06-05,0
Bob,2022-06-04,2
Bob,2022-06-03,1
Bob,2022-06-02,5
Bob,2022-06-01,0
Jen,2022-06-07,1
Jen,2022-06-06,0
Jen,2022-06-05,0
Jen,2022-06-04,1
Jen,2022-06-03,0
Jen,2022-06-02,1
Jen,2022-06-01,0
Joe,2022-06-07,0
Joe,2022-06-06,1
Joe,2022-06-05,0
Joe,2022-06-04,0
Joe,2022-06-03,1
Joe,2022-06-02,0
Joe,2022-06-01,0
Liz,2022-06,07,0
Liz,2022-06,06,0
Liz,2022-06,05,1
Liz,2022-06,04,0
Liz,2022-06,03,5
Liz,2022-06,02,0
Liz,2022-06,01,0
Rob,2022-06-07,2
Rob,2022-06-02,0
Rob,2022-05-27,0
This should return:
Bob 2022-06-07 (Value is 3 right away on the first record)
Jen 2022-06-02 (Cumulative value reaches 3 on June 2)
Joe - (Cumulative value for all records is less than 3)
Liz 2022-06-03 (Cumulative value jumps from 1 to 6 (ie >3) on June 3rd)
Rob - (Cumulative value for all records <3, just showing that there may not be data for each person every day and that shouldn't change results)
Hope this helps to clarify. Thanks!
I can't send real data, but here's an example that will hopefully make it reasonably clear:
Bob,2022-06-07,3
Bob,2022-06-06,0
Bob,2022-06-05,0
Bob,2022-06-04,1
Bob,2022-06-03,0
Bob,2022-06-02,2
Bob,2022-06-01,0
Jen,2022-06-07,0
Jen,2022-06-05,1
Jen,2022-06-04,5
Jen,2022-06-01,0
Joe,2022-06-07,0
Joe,2022-06-06,0
Joe,2022-06-05,0
Joe,2022-06-04,0
Joe,2022-06-03,0
Joe,2022-06-02,1
Joe,2022-06-01,0
Liz,2022-06-07,1
Liz,2022-06-05,0
Liz,2022-06-03,1
Liz,2022-06-01,1
The results would then be:
Bob 2022-06-07 (exceeds 3 right away on the first record)
Jen 2022-06-04 (goes from 1 to 6 on this day)
Joe - (Does not cumulatively have 3 ever)
Liz 2022-06-01 (Hits 3 on this day)
I just need to find the date, looking back from today, upon which the cumulative sum of the value column meets or exceeds 3 and return it.
This can be either a chart function or in the script, since the number of days before today is the only thing that's important - I don't need it to obey selections or anything.
Thanks!
Test... I've tried to reply to this topic twice but my replies are not appearing. 😞
Trying again...
With the following sample data set:
Bob,2022-06-07,1
Bob,2022-06-06,4
Bob,2022-06-05,0
Bob,2022-06-04,1
Bob,2022-06-03,0
Bob,2022-06-02,3
Bob,2022-06-01,1
Jen,2022-06-07,1
Jen,2022-06-06,0
Jen,2022-06-05,0
Jen,2022-06-04,1
Jen,2022-06-03,0
Jen,2022-06-02,0
Jen,2022-06-01,1
Joe,2022-06-07,0
Joe,2022-06-06,0
Joe,2022-06-03,1
Joe,2022-06-02,0
Joe,2022-06-01,0
Liz,2022-06-07,1
Liz,2022-06-04,2
Liz,2022-06-01,1
These should be the results:
Bob 2022-06-06
Jen 2022-06-01
Joe -
Liz 2022-06-04
Looking back from the most recent data with a cumulative sum:
Hope this helps to clarify, and thanks for any help!
So I came up with a working solution.
First, I accumulated the values in a new field in the script. I'm not going to go into great detail here since there are other answers on this topic, but you need to load the table in, then re-load it in a resident load to be able to Order By the dimension(s) in questions plus date desc, then use Previous to see whether the dimensions match. If they don't, you re-start the accumulation field at whatever its value is on the current record, otherwise you use Peek to find the previous value and add the current value to it.
Once you have the accumulation field created, you can load the date in question like so:
Join (WhateverTable)
Load Dimension,
Min(Date) as DateValueExceeded
Where Value >= 3 //or whatever value you need the date for
Group By Dimension;
Hope this helps someone in the future, including me the next time I try to do this. 😉