Solved: Re: Intervalmatch on several fields with AND opera... - Page 2 - Qlik Community

Amphan · ‎2014-05-14

I am trying to figure out how to categorize a transaction depending on whether several different fields in the transaction are within a certain range. In my current task I have seven fields, let's call them ACCODE01 - 07 and seven corresponding ranges, ACCODE01_START, ACCODE01_END, ACCODE02_START, ACCODE02_END etc. If each ACCODEXX field is within its corresponding range, ie. ACCODE01 is in ACCODE01_START to ACCODE01_END AND ACCODE02 is in ACCODE02_START to ACCODE02_END AND etc. then I want to mark the transaction with a Category.

Let's say I have these tables (random data, might not add up), what I want is to add CATEGORY from INTERVAL: to FACT: where ACCODEXX is in THE ACCODEXX_START to ACCODEXX_END range. If there is no match, no category should be assigned.

FACT:

LOAD * INLINE [

ACCODE01, ACCODE02, ACCODE03, ACCODE04, ACCODE05, ACCODE06, ACCODE07, VERID, AMOUNT

10000, 40261, 12340, 12300, 5478, 2541, 2587, 123456789, $100

20000, 86210, 43210, 45600, 9866, 2365, 1236, 987546, $1000

30000, 86400, 58740, 78900, 2222, 5895, 9854, 124578, $200

40000, 86111, 98560, 98700, 9874, 4578, 7854, 326598, $1100

50000, 46435, 12450, 65400, 2587, 1245, 2569, 785612, $800

60000, 45698, 45780, 32100, 3698, 6542, 3254, 324578, $500

70000, 32145, 65540, 74100, 1478, 1111, 1256, 1245782, $600

80000, 78954, 32120, 36900, 2589, 2224, 9854, 69854225, $1900

];

INTERVAL:

LOAD * INLINE [

CATEGORY, ACCODE01_START, ACCODE01_END, ACCODE02_START, ACCODE02_END, ACCODE03_START, ACCODE03_END, ACCODE04_START, ACCODE04_END, ACCODE05_START, ACCODE05_END, ACCODE06_START, ACCODE06_END, ACCODE07_START, ACCODE07_END

APPLES, 0000, 8999, 40260, 40270, 00000, 96999, 0000, 9999, 0000, 9999, 0000, 9999, 0000, 0000

ORANGES, 0000, 8999, 65410, 65510, 00000, 96999, 0000, 9999, 0000, 9999, 0000, 9999, 0000, 0000

BANANAS, 0000, 8999, 86110, 86710, 03560, 03560, 0000, 9999, 0000, 9999, 0000, 9999, 0000, 9999

CHERRYS, 0000, 8999, 86310, 86330, 03560, 03560, 0000, 9999, 0000, 9999, 0000, 9999, 0000, 9999

LIMES, 0000, 8999, 86400, 86400, 03560, 03560, 0000, 9999, 0000, 9999, 0000, 9999, 0000, 9999

LEMONS, 0000, 8999, 46430, 46440, 00000, 96999, 1000, 1099, 0000, 9999, 0000, 9999, 0000, 0000

];

I'd really appreciate any help to move forward on this task.

Thanks in advance!

Ola

Report Inappropriate Content · ‎2014-05-15

Yeah, cartesian will make very big table. How many rows do you have on the

beginning im fact and interval tables?

15-05-2014 00:28 użytkownik "Ola Paulsson" <qcwebmaster@qlik.com>

napisał:

Amphan · ‎2014-05-15

The fact table will have around 8-10 million rows but the interval table will only have around 100 rows (100 categories) or less. Each accode field holds an integer with 5 positions so with full interval ranges the cartesian product from that table would have a max of 100000 raised to 7 rows, am I right? In reality it will be lower as the interval ranges will be more narrow but still, it could be a huge table.

//Ola

Report Inappropriate Content · ‎2014-05-15

How many categories for one fact you expect separately by each key?

15-05-2014 08:42 użytkownik "Ola Paulsson" <qcwebmaster@qlik.com>

napisał:

Amphan · ‎2014-05-15

One, if I understand you correctly. Every transaction (row) in the fact table can only have one category. The category is determined by whether each of the accode fields are within range in its corresponding interval. These combinations of intervals in the intervals table are disjoint and not overlapping (unique) as a composite key.

//Ola

Report Inappropriate Content · ‎2014-05-15

If so, my example should work as you expect. It founds one category by each

key and on the end checks If all 7 categories are the same. If YES, fact

has this category. If no, fact has no category. Is it the logice you need?

15-05-2014 09:22 użytkownik "Ola Paulsson" <qcwebmaster@qlik.com>

napisał:

Qlik Community <http://community.qlik.com/> Intervalmatch on
several fields with AND operation?
reply from Ola Paulsson<http://community.qlik.com/people/OlaPaulsson?et=watches.email.thread>in
Scripting - View the full discussion<http://community.qlik.com/message/528866?et=watches.email.thread#528866>

Amphan · ‎2014-05-15

I will need to take a closer look at the example code you provided, it is a bit above my skill level so I need some time to understand it . In the meantime I think I found a solution in the document you linked to but I choose method 3 on page 6 that says "Don’t use this method unless you have very small tables". I will see if I can learn more during the weekend.

//Ola

Report Inappropriate Content · ‎2014-05-15

Ola,

script works and:

tmp_f: load 0 as VERID AutoGenerate(0);

//prepares empty table (to have it for 1'st join)

for i=1 to 7

//we have 7 keys, so we need to deal with each of them

//we join into existing tmp_f table (this is why i created it over loop)

join(tmp_f)

load ACCODE0$ as ACCODE, VERID Resident FACT;

//ACCODE0$(i) gives column ACCODE01,2,3... depending on iter number

tmp_i:

load ACCODE0$(i)_START as start, ACCODE0$(i)_END as end, CATEGORY as cat$(i), Autonumber(Num(ACCODE0$(i)_START) & '|' & Num(ACCODE0$(i)_END)) as IntervalID Resident inter;

//into tmp_i table i load i'th range boudary (columns ACCODE0$(i)_START and ACCODE0$(i)_END), but i create //from them key (interval_id) for each row

bridge:

IntervalMatch ( ACCODE ) load distinct start,end Resident tmp_i;

//create table by intervalmatch (ACCODE is key from tmp_f table used for this iteration , start and end are range //from tmp_i (temporary interval) for this iteration

//and we have to clear what our interval produces, because we would like to have only tmp_f table when we start //next iteration in our loop.

Join (tmp_f)

Load ACCODE, Autonumber(Num(start) & '|' & Num(end)) as IntervalID

Resident bridge;

Drop Table bridge;

Join (tmp_f) LOAD cat$(i), IntervalID resident tmp_i;

drop table tmp_i;

drop fields ACCODE,IntervalID;

NEXT;

//after last one iteration we should have tmp_f table where we have 7 category columns (cat1,cat2,....cat7)

//you may comment instructions under and look on it in your application.

//from this table we will check if all 7 cat have the same value.

//if max=min then all existing values must be the same.

//so, we need only check if the 7 cat have values (no nulls) and we may decide to give category for fact. Knowing //that all cat fieldsghave the same value i take first, so, my result is cat1

load VERID, if(RangeMaxString(cat1,cat2,cat3,cat4,cat5,cat6,cat7)=RangeMinString(cat1,cat2,cat3,cat4,cat5,cat6,cat7) and RangetextCount(cat1,cat2,cat3,cat4,cat5,cat6,cat7)=7, cat1) as category Resident tmp_f Where Exists(VERID);

drop table tmp_f;

Amphan · ‎2014-05-15

You are too kind! I will look at this tonight. In the meantime, I got this to work from page 6, method 3 in the document you linked to. I will see if I can use another method which doesn't create a cartesian product but this works on my testcase. But i would like to create a general method to categorize a transaction which isn't so memory intensive.

tmp_BridgeTable:

load distinct

_CompositeKey,

ACCODE01, ACCODE02, ACCODE03, ACCODE04, ACCODE05, ACCODE06, ACCODE07

resident FACT;

Join (tmp_BridgeTable)

load

_IntervalID,

CATEGORY,

ACCODE01_START, ACCODE01_END,

ACCODE02_START, ACCODE02_END,

ACCODE03_START, ACCODE03_END,

ACCODE04_START, ACCODE04_END,

ACCODE05_START, ACCODE05_END,

ACCODE06_START, ACCODE06_END,

ACCODE07_START, ACCODE07_END

resident INTERVAL;

BridgeTable:

load distinct _CompositeKey, _IntervalID

resident tmp_BridgeTable

where (ACCODE01_START <= ACCODE01 and ACCODE01 <= ACCODE01_END)

and (ACCODE02_START <= ACCODE02 and ACCODE02 <= ACCODE02_END)

and (ACCODE03_START <= ACCODE03 and ACCODE03 <= ACCODE03_END)

and (ACCODE04_START <= ACCODE04 and ACCODE04 <= ACCODE04_END)

and (ACCODE05_START <= ACCODE05 and ACCODE05 <= ACCODE05_END)

and (ACCODE06_START <= ACCODE06 and ACCODE06 <= ACCODE06_END)

and (ACCODE07_START <= ACCODE07 and ACCODE07 <= ACCODE07_END)

;

drop table tmp_BridgeTable;

//Ola

Report Inappropriate Content · ‎2014-05-16

Hi Ola,

i looked for join from your script:

Join (tmp_BridgeTable)

What are commone keys in joindet tables?

regards

Darek

Amphan · ‎2014-05-16

When loading FACT: I add

and when I load Interval I add

RecNo() as _IntervalID,

which I use in the BridgeTable to link them. Solves the problem and doesn't seem to be too bad on performance, but then again there isn't that much data.

/Ola

Intervalmatch on several fields with AND operation?