Solved: Re: Need help with set analysis - Qlik Community

Quy_Nguyen · ‎2019-10-09

Hi everyone,

I have data like this bellow table:

ID	Position	Item	Quantity	OrderTime
1	1	A	1	0
1	2	A	1	2
1	3	B	1	0
1	4	B	1	4
2	5	A	1	1
2	6	B	1	5
2	9	C	1	4
3	7	A	1	0
3	8	A	1	5

This data table is about detail order, we have the quantity and the order time (in minute from the order create time) for each item sold. I want to calculate how many items (total quantity) that have re-purchase in an order after x minutes (x is a time variable)

For example, let's talk about Order ID 3. I have item A was bought 2 times at 0 and 5 mins. I set the time variable to 2 mins. Then I want my expression return the value : 1. I will say that customer bought item A before 2 mins ( <) and another item A after 2mins ( >= ). The return value is the amount of up-sell. If I set the time variable to 0 or 5, the value should return 0.

Another example for the whole table above, I set the time variable to 2 and I want to return 3 for the total quantity of up-sell. ( 2 up-sell in Order 1 and 1 up-sell in Order 3).

This is my current expression, and it returns 4, included one position in the Order2, which is wrong.

vOrderTime = 2

sum({$<ID = p({<[OrderTime] = {"<$(vOrderTime)"}>} [ID])* p({<[OrderTime] = {">=$(vOrderTime)"}>} [ID]),
Item = p({<[OrderTime] = {"<$(vOrderTime)"}>} Item)* p({<[OrderTime] = {">=$(vOrderTime)"}>} Item),
Position = p({<[OrderTime] = {">=$(vOrderTime)"}>} Position)>} Quantity)

I understand the logic is: get all positions which in the same order, having same Item purchased before and after the vOrderTime, then filter positions which have OrderTime >= vOrderTime .

I hope that you got what I am trying to explain. Any help would be appreciated!

thi_pham · ‎2019-10-10

One more try, kaka:

sum(aggr(sum(if(IDItemNoOfOccurence > 1 and OrderTime > 2, Quantity, 0)), ID, Item))

with a new calculated column: (can be done on script):

TempTable:
Load *, rowno() as row Inline [
ID,Position,Item,Quantity,OrderTime,
1,1,A,1,0
1,2,A,1,2
1,3,B,1,0
1,4,B,1,4
2,5,A,1,1
2,6,B,1,5
2,9,C,1,4
3,7,A,1,0
3,8,A,1,5
];

NoConcatenate
TempTable2:
Load AutoNumber(RecNo(),ID&Item) as IDItemNoOfOccurence,
ID,Position,Item,Quantity,OrderTime
Resident TempTable
Order by ID, Item, row;

drop table TempTable;

View solution in original post

vunguyenq89 · ‎2019-10-10

Hi Quy,

Try this expression

=Sum(
	Aggr(
		Sum(If(OrderTime >= vOrderTime , Quantity,0)) // sum all rows having OrderTime > vOrderTime
        * Count(DISTINCT If(OrderTime < vOrderTime , Item)), //above sum will be zeroed if this count returns zero (there is no item purchased before vOrderTime)
        ID,Item), // for each (ID,Item) context
)

This should be used outside of tables which contains both columns (ID, Item). Otherwise the result will be affected by the table context.

Generally Set Analysis is unable to solve inter-record problems. This case is such problem: calculation on each row (sum quantity of each ID-Item) depends on values of other rows (whether any of the same Item has been purchased in the same order before vOrderTime)

Hope this helps 🙂

View solution in original post

Lauri · ‎2019-10-09

Are you sure you can use * like that? I have always done it between two <> expressions, like:

<ID = p({<[OrderTime] = {"<$(vOrderTime)"}>} [ID])> * <ID = p({<[OrderTime] = {">=$(vOrderTime)"}>} [ID])>

I also think your problem is using * instead of +. With *, you are asking for the IDs where the OrderTime is less than vOrderTime AND greater than vOrderTime -- impossible! You want to use + to get one OR the other.

Quy_Nguyen · ‎2019-10-09

Thanks Laurischaf,

I think the operator could be use like what I am doing. And the ID is the order ID not the Position (order detail), so we can say "get a set of order ID where the item OrderTime is less than vOrderTime AND greater than vOrderTime" - in my opinion its logic is correct.

By the way, could you please go further with your idea and help me with an example one?

thi_pham · ‎2019-10-10

Is it compulsory to calculate it on measure with set analytics. I think it should be easier to handle it on data load with peek.

My idea is to use peek/previous function with right condition to mark records which is up-sell record, like:

new column with formular : if (ID = peek(ID) and item = peek(Item) and OrderTime > $(OrderTime), 'Y' , 'N') as IsUpSell.

Then, further calculation would not be a problem.

Hope it help.

Quy_Nguyen · ‎2019-10-10

Thanks a Thi.

It's good to solve it at the script level when my variable time is fixed. I am going to have a slider for the vOrderTime, so it must be solved in the chart level. I think it is feasible, but currently I can't make it 😄

thi_pham · ‎2019-10-10

I see.

How about this: sum(aggr(sum(if(OrderTime > $(vOrderTime), Quantity, 0)), ID, Item))

Quy_Nguyen · ‎2019-10-10

Thanks. But it doesn't return the correct value.

thi_pham · ‎2019-10-10