Hi,
I am looking for tips to reduce reload time by maximising CPU usage.
We have built a complex script that calculates payroll data, based on an activity table.
We have approximately 9000 rows in the activity table, and we loop over them in a FOR/NEXT script, doing a Peek of each field value row by row.
For each source row, we calculate a target row, written with an AUTOGENERATE command.
The script takes about 30 minutes. The Peek function seems to be what is time-consuming: if we replace the AUTOGENERATE call with a simple TRACE to the log, the duration doesn't change.
We have run the same logic on excel/vba, and it goes through the 9000 lines in a few minutes.
Testing this process on a dual-core processor (hyperthreading disabled), we only reach 50% CPU usage during the FOR/NEXT phase.
Any tips on how to use the CPU more efficiently and accelerate the loop?
Thanks in advance!
James
If you use Peek you're probably doing an ORDER BY as well, and sorting isn't an operation that can be parallelized.
Also make sure you read these tips: http://community.qlik.com/docs/DOC-3503.
You can also post your code so that someone can try to optimize it.
Hi,
Thanks for your tips.
My code has the structure shown below.
I don't see any special function in my loop. I'm just working with field values, plus a couple of variables storing previous-row info, etc.
The performance tests we made are the following:
-TEST1: run loop script on the virtualized production server (VMware): 1h15
-TEST2: run loop script on a laptop (dual-core i3), full environment moved to the laptop, hyperthreading enabled: 37 min
-TEST3: run loop script on the laptop, full environment on the laptop, hyperthreading disabled: 38 min
-TEST4: run loop script on the laptop, using the server environment (qvw and qvd stored on the server): 1h15
Having logging activated or not does not impact performance.
-> My virtualized server is currently less efficient than an i3 laptop.
-> I don't understand what makes the process slower when run on the laptop from a qvw stored on the server.
Isn't QlikView supposed to work in RAM, so that once the qvw is in RAM and the source qvd is loaded,
the process should only work locally, giving Test 4 performance similar to Test 2 or Test 3?
My loop process has an average CPU usage of 25%.
Thanks in advance for any help!!
James
---------------------------------------------------------------------------
------------------Code structure-----------------------------------
Sub WriteTargetRow (Field1,Field2,...,Field12)
  TargetTable:
  LOAD
    '$(Field1)' as TargetField1,
    ...
    '$(Field12)' as TargetField12
  AUTOGENERATE 1;
End Sub;

// Peek row indexes are zero-based, so the last row is count - 1
FOR vRowNb = 0 to $(vNbOfRowInSourceTable) - 1
  // LET is needed so the right-hand side is evaluated, not stored as text
  LET vField1 = Text(Peek('Field1', vRowNb, 'Sourcetable'));
  ...
  LET vField12 = Text(Peek('Field12', vRowNb, 'Sourcetable'));
  IF vField1 = 'HeureFerieNT' THEN
    LET vTargetField2 = 'HNF';
    LET vTargetField3 = '$(vField3)';
    CALL WriteTargetRow ('$(vTargetField1)','$(vTargetField2)',.....'$(vTargetField12)')
  ELSE
    // other similar tests on source field values in nested IF/ELSE blocks
  END IF
  // NEXT increments vRowNb itself; no manual "vRowNb = vRowNb + 1" needed
NEXT
Your loop will most definitely not be parallelized.
You should look into a way to do the same thing through set-based scripting, joining tables, etc., instead of looping through all rows.
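For what it's worth, if the per-row rules only depend on the current row (and at most the previous one), the whole Peek loop can often be collapsed into a single RESIDENT LOAD, which the engine processes far faster than 9000 interpreted loop iterations. A minimal sketch in load script follows; the field names and conditions are illustrative, not your actual payroll rules:

// Hypothetical sketch: build the target table in one pass over the
// source table instead of one Peek() call per field per row.
TargetTable:
LOAD
    Field1                                        as TargetField1,
    If(Field1 = 'HeureFerieNT', 'HNF', Field2)    as TargetField2,
    Field3                                        as TargetField3,
    // Previous() reads the value from the prior input row, which can
    // replace the "previous row" variables used in the loop
    If(Field1 = Previous(Field1), 'same', 'new')  as RowChangeFlag
RESIDENT Sourcetable;

Each IF/ELSE branch in the loop typically becomes a nested If() (or a mapping table via ApplyMap) in the LOAD, and the WriteTargetRow subroutine disappears entirely.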
OK,
I guess I won't be able to write the script without Peek logic.
I might focus on administering my process so that it can be executed partially, to give more flexibility to the main user.
In that case, I'm also interested in getting a small dedicated server with few but fast CPUs to run my reloads.
Any idea on how to explain performance difference between my tests 3 and 4?
James
If you're reading and writing information over the network, you will lose time transferring the data in and out. Or you might have a network setup issue, such as a network card negotiating 100 instead of 1000 Mbps.