Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
I am trying to figure out how to categorize a transaction depending on whether several different fields in the transaction are within a certain range. In my current task I have seven fields, let's call them ACCODE01 - 07 and seven corresponding ranges, ACCODE01_START, ACCODE01_END, ACCODE02_START, ACCODE02_END etc. If each ACCODEXX field is within its corresponding range, ie. ACCODE01 is in ACCODE01_START to ACCODE01_END AND ACCODE02 is in ACCODE02_START to ACCODE02_END AND etc. then I want to mark the transaction with a Category.
Let's say I have these tables (random data, might not add up), what I want is to add CATEGORY from INTERVAL: to FACT: where ACCODEXX is in THE ACCODEXX_START to ACCODEXX_END range. If there is no match, no category should be assigned.
FACT:
LOAD * INLINE [
ACCODE01, ACCODE02, ACCODE03, ACCODE04, ACCODE05, ACCODE06, ACCODE07, VERID, AMOUNT
10000, 40261, 12340, 12300, 5478, 2541, 2587, 123456789, $100
20000, 86210, 43210, 45600, 9866, 2365, 1236, 987546, $1000
30000, 86400, 58740, 78900, 2222, 5895, 9854, 124578, $200
40000, 86111, 98560, 98700, 9874, 4578, 7854, 326598, $1100
50000, 46435, 12450, 65400, 2587, 1245, 2569, 785612, $800
60000, 45698, 45780, 32100, 3698, 6542, 3254, 324578, $500
70000, 32145, 65540, 74100, 1478, 1111, 1256, 1245782, $600
80000, 78954, 32120, 36900, 2589, 2224, 9854, 69854225, $1900
];
INTERVAL:
LOAD * INLINE [
CATEGORY, ACCODE01_START, ACCODE01_END, ACCODE02_START, ACCODE02_END, ACCODE03_START, ACCODE03_END, ACCODE04_START, ACCODE04_END, ACCODE05_START, ACCODE05_END, ACCODE06_START, ACCODE06_END, ACCODE07_START, ACCODE07_END
APPLES, 0000, 8999, 40260, 40270, 00000, 96999, 0000, 9999, 0000, 9999, 0000, 9999, 0000, 0000
ORANGES, 0000, 8999, 65410, 65510, 00000, 96999, 0000, 9999, 0000, 9999, 0000, 9999, 0000, 0000
BANANAS, 0000, 8999, 86110, 86710, 03560, 03560, 0000, 9999, 0000, 9999, 0000, 9999, 0000, 9999
CHERRYS, 0000, 8999, 86310, 86330, 03560, 03560, 0000, 9999, 0000, 9999, 0000, 9999, 0000, 9999
LIMES, 0000, 8999, 86400, 86400, 03560, 03560, 0000, 9999, 0000, 9999, 0000, 9999, 0000, 9999
LEMONS, 0000, 8999, 46430, 46440, 00000, 96999, 1000, 1099, 0000, 9999, 0000, 9999, 0000, 0000
];
I'd really appreciate any help to move forward on this task.
Thanks in advance!
Ola
Hi Ola,
are your ranges disjoint or may overlap (if they overlaping, you may get more than one category for one ACCODE)?
what if you will get 1 category by ACCODE01 and another by ACCODE02?
do you need after this transformation 1 category for each row?
regards
Darek
Hi Darek, the idea is that they should be disjoint, every matching transaction can only get one unique category or if no match then no category. If I didn't need to check for intervals I would just create a key of the ACCODE fields but I am at a loss as the fields should be matched to intervals. So I need to check for ACCODE01 in interval01 AND ACCODE02 in interval02 etc. for all seven fields/intervals and if all is true then set the category.
//Ola
for me:
do loop with 7 iterations
in each iteration:
intervalmatch 1 column and into FACT table.
after last one loop you should have FACT table with added 7 columns (cat1-cat7).
then load resident FACT table (columns you need) again with new name, using if(rangeminstring (col1......col7)=rangemaxstring (col1......col7) and rangemissingcount (col1....col7)=0, col1) as result_category.
drop table FACT.
regards
Darek
And here you have all about methods to do interval matching:
it maybe some start point.
regards
Darek
Thanks Darek, trying out your idea I realize that your earlier question was spot on. I can get more than one category for one ACCODE, it is all ACCODES together that is disjoint and may not overlap. If it were'nt for the intervals I would create a composite key. Trying out your idea above I got stuck on the first column in the interval table where all starts is 0000 and all ends is 8999 which would give me all categories on all ACCODE01s. Any ideas on how to solve this?
Thanks Darek, I will have a look at it.
Ok, lets look on it again. I can imagine, that for some fact i can get for
example:
By accode01 banana or Apple
By accode02 banana or lenin
....
By accode07 banana or Orange.
So, there is the question what category wins for this fact row.
What are the criteria? Is it possible, that more than one category for this
one fact row will have the same rank?
Regards
D
14-05-2014 23:41 użytkownik "Ola Paulsson" <qcwebmaster@qlik.com>
napisał:
Qlik Community <http://community.qlik.com/> Intervalmatch on
several fields with AND operation?
reply from Ola Paulsson<http://community.qlik.com/people/OlaPaulsson?et=watches.email.thread>in
Scripting - View the full discussion<http://community.qlik.com/message/528689?et=watches.email.thread#528689>
No, one row will always be unique, ie. accode01&accode02&accode03&accode04&accode05&accode06&accode07 will only match one category in interval table. No overlap for the composite key.
Maybe one approach would be to create a table with all possible combinations of intervals and categories and then check that against a composite key in the fact table? But i worry about the size of such a table.
Will work on this tomorrow but will appreciate any input you have! Many thanks!
//Ola