Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi everyone,
this is my first post and I apologize if this has been asked before (actually, I couldn't find any proper solution) or if I'm missing to share some info.
Anyway, my problem is the following, I have a table with some data:
DATA |
1 |
2 |
3 |
4 |
5 |
6 |
7 |
8 |
9 |
10 |
and I would like to cumulate its values based on a "day" criteria. In other words, I'm trying to get this output:
DATA | 2 days | 3 days | 4 days | 5 days |
1 | ||||
2 | 3 | |||
3 | 6 | |||
4 | 10 | 10 | ||
5 | 15 | |||
6 | 21 | 21 | ||
7 | ||||
8 | 36 | 36 | ||
9 | 45 | |||
10 | 55 | 55 |
so, this means that I will need to cumulate starting always from the top of "DATA" (value=1) but skipping rows based on headers.
I hope my problem is clear
Thanks a lot for all your support!
A crude script solution could be like:
Load
DATA,AccDATA,
If(Mod(RowNo(),2)=0, AccDATA) as Day2,
If(Mod(RowNo(),3)=0, AccDATA) as Day3,
If(Mod(RowNo(),4)=0, AccDATA) as Day4,
If(Mod(RowNo(),5)=0, AccDATA) as Day5,
If(Mod(RowNo(),6)=0, AccDATA) as Day6,
If(Mod(RowNo(),7)=0, AccDATA) as Day7 ;
Load
DATA,
If(RowNo()=1,DATA, RangeSum(Peek('AccDATA'), DATA)) as AccDATA
;
Load * Inline [
DATA
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
];
Drop Field AccDATA;
Can you clarify the rule?
Yes sure.
The idea is to cumulate "DATA" values, always from the top ("DATA" = 1). Each new column represents the frequency of the cumulate action.
For example, under "2days" column it means I have to cumulate data from the top every 2 days. So, it would be:
(there was an error on the table, now it's right)
other questions :
the columns 2days->5days are fixed? always 4?
if I follow this logic I get for the 2days the values in the table below, how did you remove the other values and leave only what you display?
DATA | 2 days | 3 days | 4 days | 5 days |
1 | ||||
2 | 3 | |||
3 | 6 | 6 | ||
4 | 10 | 10 | ||
5 | 15 | 15 | ||
6 | 21 | 21 | ||
7 | 28 | |||
8 | 36 | 36 | ||
9 | 45 | 45 | ||
10 | 55 | 55 |
I have never seen a data analytics problem like this so far. But not a problem, we are here to learn/know new things. Coming back to your problem. How far can your data go? How do you decide that you have to stop at Day 5 here? Also could you explain your business case for this?
Thanks for your comment.
You got the idea, now my problem is removing the other values and display only values that you see in my table.
I can do it manually as in my example but I've got thousands of rows and they will increase in the future.
how to remove them? the rule ?
and 2->5 are fixed ?
Actually, it can go over 5. It was just to give an idea of my problem. Let's say this since our thoughts are based on a calendar we are considering weeks (from every day till every 7 days), months (will be our next scope)
My business case is the following:
-values under "DATA" are the request of notifications that we receive (so they are not that linear). Every day new people are asking to get notified.
So what we are trying to understand is how often we should notify them sand since every day there are new members we will need to consider both new and old requests. A the end we will have a big table with all this information that will help us decide.
Hope this help
Thanks a lot
Taoufiq,
The rule part is probably like:
they would appear in rows where they are divisible by the corresponding day number
But your 2->5 query remains to be answered.
2 -> 5 are fixed in my example.
But this number could increase if we want to evaluate new options.
Right now we are working on a range that goes from every day till every 7 days (so 1->7)