Solved: Optimize the loading time - Qlik Community

Smail_Barat · ‎2009-11-02

Hello everybody, I m developing a QlikView application to generate a table of monthly loans of customer from a source table "Oracle", The treatment consist of loading an Oracle table (see attached file) in QlikView and duplicating each row in oracle DB by the number of months of each loan and so on my problem is that the treatment take too much time when extracting data, so I want to optimize the loading time, I can't find a method that gives me the desired result but with a better response time. I've seen that there is a concept of Hash Tables in Qlikview, but I have not found any explanation on the Qlikview documentation. Any suggestion

Oleg_Troyansky · ‎2009-11-02

As far as I understand your script, the slowest part is processing 100K records one by one, in a nested loop... You need to get away from the loop and into a regular "LOAD" that would do the job...

Look at the syntax of a LOAD statement with WHILE clause - you can load all your table (100K rows) and ask QlikView to generate records while the condition is true. I'll give you a simple example:


Tab1:
Load
 EventID,
 MinDate,
 MaxDate
from
 ...
;
Tab2
load
EventID,
AddMonths(MinDate, IterNo() - 1) as GeneratedMonth
resident Tab1
WHILE
AddMonths(MinDate, IterNo() - 1) <= MaxDate
;

In this example, QlikView will generate individual rows with the specific Months as long as the generate Month is between MinDate and MaxDate. Notice, that function IterNo() represents the number of the iteration and it begins with 1 (that's why we usually subtract 1 for the first record to match the MinDate).

cheers,

Oleg Troyansky
Call me for expert support, performance tuning, advanced training and more!

View solution in original post

Oleg_Troyansky · ‎2009-11-02

As far as I understand your script, the slowest part is processing 100K records one by one, in a nested loop... You need to get away from the loop and into a regular "LOAD" that would do the job...

Look at the syntax of a LOAD statement with WHILE clause - you can load all your table (100K rows) and ask QlikView to generate records while the condition is true. I'll give you a simple example:


Tab1:
Load
 EventID,
 MinDate,
 MaxDate
from
 ...
;
Tab2
load
EventID,
AddMonths(MinDate, IterNo() - 1) as GeneratedMonth
resident Tab1
WHILE
AddMonths(MinDate, IterNo() - 1) <= MaxDate
;

In this example, QlikView will generate individual rows with the specific Months as long as the generate Month is between MinDate and MaxDate. Notice, that function IterNo() represents the number of the iteration and it begins with 1 (that's why we usually subtract 1 for the first record to match the MinDate).

cheers,

Oleg Troyansky
Call me for expert support, performance tuning, advanced training and more!

johnw · ‎2009-11-02


Oleg Troyansky wrote:Look at the syntax of a LOAD statement with WHILE clause - you can load all your table (100K rows) and ask QlikView to generate records while the condition is true. I'll give you a simple example:

How fast is the WHILE? It seems like you're still doing a loop, so I would expect it to be slow, but I haven't experimented on any large tables.

I'm actually working on a very similar requirement today, and I went with something significantly more complex to avoid looping on big tables. Maybe I'm just wasting time, but here's the general approach I have at the moment:

[All Months for Input Keys]:
// First we create a table of ALL months.
// The below approach assumes that all months have already been loaded in some table.
// There's a while loop, but not for very many records, as it doesn't depend on the key.
LOAD fieldvalue('month',iterno()) as "month"
AUTOGENERATE 1
WHILE len(fieldvalue('month',iterno()))
;
// We then break this out so that we have all months for all input keys
LEFT JOIN ([All Months for Input Keys])
LOAD DISTINCT "input key"
RESIDENT [Input]
;

[Output]:
// We start building our output table, adding a start and end month for each input key
LOAD
"input key"
,any other fields and expressions from the input data that you want
,expression for start month as "start month"
,expression for end month as "end month"
RESIDENT [Input]
;

[Match]:
// Now we build a table with input key, start month, end month, and ALL months between.
// The intervalmatch handles the magic of only including the months we want.
INTERVALMATCH ("month","input key")
LOAD
"start month"
,"end month"
,"input key"
RESIDENT [Output]
;
// If you wish, you can just keep the Match table, and everything should work even with
// the original number of rows on the output table. It may or may not be more efficient
// to just end up with one table. To do that, you would just join your match table into
// your output table. The join creates the duplicate rows, one for each month between
// the start month and end month, inclusive.
LEFT JOIN ([Output])
LOAD
"input key"
,"month"
RESIDENT [Match]
;

// Finally, get rid of the junk we've accumulated in the process
DROP TABLES
[Input]
,[All Months for Input Keys]
,[Match]
;

I suppose I should just do some performance testing on the vastly simpler approach you suggested.

johnw · ‎2009-11-02


John Witherspoon wrote: I suppose I should just do some performance testing on the vastly simpler approach you suggested.

Well, it loaded very fast with the WHILE loop. Bah. I just wasted a whole lot of time on a complicated solution to a non-problem based on some bad assumptions about performance. Ah, well. At least now I know not to fear WHILEs, and my script is much cleaner. Thanks, Oleg.

I'll let my solution stand as an example of what NOT to do.

Oleg_Troyansky · ‎2009-11-03

John,

in my experience, WHILE performs very well. Glad I could help.

Oleg Troyansky
Call me for expert support, performance tuning, advanced training and more!

Smail_Barat · ‎2009-11-03

Thank you Jhon and Oleg for the answers
I modified my script and effectively while loop gives superb results very fast than For loop,
my new script looks like:

//=================================================
// THE INPUT DATA TABLE
//=================================================
DOSS_INPUT: // The original table is an Oracle Table with 100000 records, this inline table is used to replace the step one in the INPUT TAB

load * inline [

COMPTE,DOSSIER,DATE_OUVERTURE,PREMIERE_ECHEANCE,NBRE_ECHEANCE,FREQUENCE,MONTANT_ECHEANCE,NOMINAL_DOSSIER,TAUX_INTERET,DUREE_PRET,DIFFERE,MOIS_DIFFERE
30001,30001,'05/10/2009','01/12/2009',96,'M',3611,94,250000,5,96,'',10
30002,30002,'05/10/2009','01/12/2009',96,'M',3611,94,250000,5,96,'',10
30003,30003,'05/10/2009','01/12/2009',96,'M',3611,94,250000,5,96,'',10
30004,30004,'05/10/2009','01/12/2009',96,'M',3611,94,250000,5,96,'',10
30005,30005,'05/10/2009','01/12/2009',96,'M',3611,94,250000,5,96,'',10
30006,30006,'05/10/2009','01/12/2009',96,'M',3611,94,250000,5,96,'',10
30007,30007,'05/10/2009','01/12/2009',96,'M',3611,94,250000,5,96,'',10
30008,30008,'05/10/2009','01/12/2009',96,'M',3611,94,250000,5,96,'',10
30009,30009,'05/10/2009','01/12/2009',96,'M',3611,94,250000,5,96,'',10
30010,30010,'05/10/2009','01/12/2009',96,'M',3611,94,250000,5,96,'',10
30011,30011,'05/10/2009','01/12/2009',96,'M',3611,94,250000,5,96,'',10
30012,30012,'05/10/2009','01/12/2009',96,'M',3611,94,250000,5,96,'',10
30013,30013,'05/10/2009','01/12/2009',96,'M',3611,94,250000,5,96,'',10
30014,30014,'05/10/2009','01/12/2009',96,'M',3611,94,250000,5,96,'',10
30015,30015,'05/10/2009','01/12/2009',96,'M',3611,94,250000,5,96,'',10
30016,30016,'05/10/2009','01/12/2009',96,'M',3611,94,250000,5,96,'',10
30017,30017,'05/10/2009','01/12/2009',96,'M',3611,94,250000,5,96,'',10
30018,30018,'05/10/2009','01/12/2009',96,'M',3611,94,250000,5,96,'',10
30019,30019,'05/10/2009','01/12/2009',96,'M',3611,94,250000,5,96,'',10
30020,30020,'05/10/2009','01/12/2009',96,'M',3611,94,250000,5,96,'',10
30021,30021,'05/10/2009','01/12/2009',96,'M',3611,94,250000,5,96,'',10
30022,30022,'05/10/2009','01/12/2009',96,'M',3611,94,250000,5,96,'',10
30023,30023,'05/10/2009','01/12/2009',96,'M',3611,94,250000,5,96,'',10
30024,30024,'05/10/2009','01/12/2009',96,'M',3611,94,250000,5,96,'',10
30025,30025,'05/10/2009','01/12/2009',96,'M',3611,94,250000,5,96,'',10
30026,30026,'05/10/2009','01/12/2009',96,'M',3611,94,250000,5,96,'',10
30027,30027,'05/10/2009','01/12/2009',96,'M',3611,94,250000,5,96,'',10
30028,30028,'05/10/2009','01/12/2009',96,'M',3611,94,250000,5,96,'',10
30029,30029,'05/10/2009','01/12/2009',96,'M',3611,94,250000,5,96,'',10
30030,30030,'05/10/2009','01/12/2009',96,'M',3611,94,250000,5,96,'',10

] ;

//=================================================
// THE INPUT DATA TEMPORARY TABLE
//=================================================

DOSSIER_INPUT:
LOAD COMPTE,
DOSSIER,
date(DATE_OUVERTURE) AS DATE_OUVERTURE,
date(PREMIERE_ECHEANCE) AS PREMIERE_ECHEANCE,
NBRE_ECHEANCE,
FREQUENCE,
num(MONTANT_ECHEANCE) AS MONTANT_ECHEANCE,
num(NOMINAL_DOSSIER) AS NOMINAL_DOSSIER,
num(TAUX_INTERET) AS TAUX_INTERET, // renaming added by Delphine
Num(DUREE_PRET) AS DUREE_PRET, // renaming added by Delphine
DIFFERE,
MOIS_DIFFERE,
autonumber (COMPTE&'-' & DOSSIER) as Key
RESIDENT DOSS_INPUT;

//====================================================
// DROP NO LONGER NEEDED TABLES
//====================================================

DROP TABLE DOSS_INPUT;

//=================================================
// THE INPUT DATA PROCESSING TABLE
//=================================================

FOR RecordNummer = 0 TO (NoOfRows('DOSSIER_INPUT'))-1

LET vKey = peek('Key', $(RecordNummer), 'DOSSIER_INPUT');
LET vNBRE_ECHEANCE = peek('NBRE_ECHEANCE', $(RecordNummer), 'DOSSIER_INPUT');

DOSSIER_OUTPUT:
LOAD Key,
AddMonths(PREMIERE_ECHEANCE, IterNo() - 1) AS DATE,
IterNo() as ORDER_ECHEANCE
RESIDENT DOSSIER_INPUT
WHILE IterNo() < '$(vNBRE_ECHEANCE)' +1;
NEXT

//=================================================

Thank you again Oleg

Smail_Barat · ‎2009-11-03

I still need to add some other calculations to the new generated table (in the load statement) is there any suggestion?,

The new calculation variables:

LET vINTERET = num(peek('TAUX', $(RecordNummer), 'DOSSIER_INPUT')) * num($(vRESTANT_DU));
LET vCAPITAL_REM = num(peek('MONTANT_ECHEANCE', $(RecordNummer), 'DOSSIER_INPUT')) - num($(vINTERET));

Num('$(vINTERET)', '# ###,00') AS INTERET,
Num('$(vRESTANT_DU)', '# ###,00') AS RESTANT_DU,
Num('$(vCAPITAL_REM)','# ###,00') AS PRINCIPAL

LET vRESTANT_DU = num($(vRESTANT_DU)) - num($(vCAPITAL_REM));