Skip to main content
Announcements
NEW: Seamless Public Data Sharing with Qlik's New Anonymous Access Capability: TELL ME MORE!
cancel
Showing results for 
Search instead for 
Did you mean: 
Quy_Nguyen
Specialist
Specialist

Need help with set analysis

Hi everyone,

I have data like this bellow table:

IDPositionItemQuantityOrderTime
11A10
12A12
13B10
14B14
25A11
26B15
29C14
37A10
38A15

 

This data table is about detail order, we have the quantity and the order time (in minute from the order create time) for each item sold. I want to calculate how many items (total quantity) that have re-purchase in an order after x minutes (x is a time variable)

For example, let's talk about Order ID 3. I have item A was bought 2 times at 0 and 5 mins. I set the time variable to 2 mins. Then I want my expression return the value : 1. I will say that customer bought item A before 2 mins ( <) and another item A after 2mins ( >= ). The return value is the amount of up-sell. If I set the time variable to 0 or 5, the value should return 0.

Another example for the whole table above, I set the time variable to 2 and I want to return 3 for the total quantity of up-sell. ( 2 up-sell in Order 1 and 1 up-sell in Order 3).

This is my current expression, and it returns 4, included one position in the Order2, which is wrong.

vOrderTime = 2

 

 

sum({$<ID = p({<[OrderTime] = {"<$(vOrderTime)"}>} [ID])* p({<[OrderTime] = {">=$(vOrderTime)"}>} [ID]),
Item = p({<[OrderTime] = {"<$(vOrderTime)"}>} Item)* p({<[OrderTime] = {">=$(vOrderTime)"}>} Item),
Position = p({<[OrderTime] = {">=$(vOrderTime)"}>} Position)>} Quantity)

 

 

I understand the logic is: get all positions which in the same order, having same Item purchased before and after the vOrderTime, then filter positions which have OrderTime >= vOrderTime .

I hope that you got what I am trying to explain. Any help would be appreciated!

2 Solutions

Accepted Solutions
thi_pham
Creator III
Creator III

One more try, kaka: 

sum(aggr(sum(if(IDItemNoOfOccurence > 1 and OrderTime > 2, Quantity, 0)), ID, Item))

with a new calculated column: (can be done on script): 

TempTable:
Load *, rowno() as row Inline [
ID,Position,Item,Quantity,OrderTime,
1,1,A,1,0
1,2,A,1,2
1,3,B,1,0
1,4,B,1,4
2,5,A,1,1
2,6,B,1,5
2,9,C,1,4
3,7,A,1,0
3,8,A,1,5
];

NoConcatenate
TempTable2:
Load AutoNumber(RecNo(),ID&Item) as IDItemNoOfOccurence,
ID,Position,Item,Quantity,OrderTime
Resident TempTable
Order by ID, Item, row;

drop table TempTable;

View solution in original post

vunguyenq89
Creator III
Creator III

Hi Quy,

Try this expression

=Sum(
	Aggr(
		Sum(If(OrderTime >= vOrderTime , Quantity,0)) // sum all rows having OrderTime > vOrderTime
        * Count(DISTINCT If(OrderTime < vOrderTime , Item)), //above sum will be zeroed if this count returns zero (there is no item purchased before vOrderTime)
        ID,Item), // for each (ID,Item) context
)

This should be used outside of tables which contains both columns (ID, Item). Otherwise the result will be affected by the table context.

TOTAL.png

Generally Set Analysis is unable to solve inter-record problems. This case is such problem: calculation on each row (sum quantity of each ID-Item) depends on values of other rows (whether any of the same Item has been purchased in the same order before vOrderTime)

Hope this helps 🙂

View solution in original post

12 Replies
Lauri
Specialist
Specialist

Are you sure you can use * like that? I have always done it between two <> expressions, like:

<ID = p({<[OrderTime] = {"<$(vOrderTime)"}>} [ID])> * <ID = p({<[OrderTime] = {">=$(vOrderTime)"}>} [ID])>

I also think your problem is using * instead of +. With *, you are asking for the IDs where the OrderTime is less than vOrderTime AND greater than vOrderTime -- impossible! You want to use + to get one OR the other.

Quy_Nguyen
Specialist
Specialist
Author

Thanks Laurischaf,

I think the operator could be use like what I am doing. And the ID is the order ID not the Position (order detail), so we can say "get a set of order ID where the item OrderTime is less than vOrderTime AND greater than vOrderTime" - in my opinion its logic is correct.

By the way, could you please go further with your idea and help me with an example one?

thi_pham
Creator III
Creator III

Is it compulsory  to calculate it on measure with set analytics. I think it should be easier to handle it on data load with peek.

My idea is to use peek/previous function with right condition to mark  records which is up-sell record, like:

new column with formular : if (ID = peek(ID) and item = peek(Item) and OrderTime > $(OrderTime), 'Y' , 'N') as IsUpSell.

Then, further calculation would not be a problem. 

Hope it help.

Quy_Nguyen
Specialist
Specialist
Author

Thanks a Thi.

It's good to solve it at the script level when my variable time is fixed. I am going to have a slider for the vOrderTime, so it must be solved in the chart level. I think it is feasible, but currently I can't make it 😄

thi_pham
Creator III
Creator III

I see.

How about this: sum(aggr(sum(if(OrderTime > $(vOrderTime), Quantity, 0)), ID, Item))

Quy_Nguyen
Specialist
Specialist
Author

Thanks. But it doesn't return the correct value.

thi_pham
Creator III
Creator III

One more try, kaka: 

sum(aggr(sum(if(IDItemNoOfOccurence > 1 and OrderTime > 2, Quantity, 0)), ID, Item))

with a new calculated column: (can be done on script): 

TempTable:
Load *, rowno() as row Inline [
ID,Position,Item,Quantity,OrderTime,
1,1,A,1,0
1,2,A,1,2
1,3,B,1,0
1,4,B,1,4
2,5,A,1,1
2,6,B,1,5
2,9,C,1,4
3,7,A,1,0
3,8,A,1,5
];

NoConcatenate
TempTable2:
Load AutoNumber(RecNo(),ID&Item) as IDItemNoOfOccurence,
ID,Position,Item,Quantity,OrderTime
Resident TempTable
Order by ID, Item, row;

drop table TempTable;

vunguyenq89
Creator III
Creator III

Hi Quy,

Try this expression

=Sum(
	Aggr(
		Sum(If(OrderTime >= vOrderTime , Quantity,0)) // sum all rows having OrderTime > vOrderTime
        * Count(DISTINCT If(OrderTime < vOrderTime , Item)), //above sum will be zeroed if this count returns zero (there is no item purchased before vOrderTime)
        ID,Item), // for each (ID,Item) context
)

This should be used outside of tables which contains both columns (ID, Item). Otherwise the result will be affected by the table context.

TOTAL.png

Generally Set Analysis is unable to solve inter-record problems. This case is such problem: calculation on each row (sum quantity of each ID-Item) depends on values of other rows (whether any of the same Item has been purchased in the same order before vOrderTime)

Hope this helps 🙂

thi_pham
Creator III
Creator III

So clear.