Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hello everybody!
I have a conceptual question considering the WHERE-clause in load script. How does Qlik process this statement?
Simple example:
Load *
Resident Table1
Where 1 = 1
and Year(Date) = 2016
and Activity = 1
;
How does Qlik process this statement?
I see 2 possible approaches.
1. a) take Table1
b) take all the rows in Table1 with year = 2016 (let us call it subset1)
c) take all the rows in subset1 with Activity = 1
2. a) take Table1
b) check if the row has year = 2016
c) if step b is true, check if the row has activity = 1
d) process next row
I think right understanding of this process can be very useful in performance increasing.
Have a nice day!
Elena
I don't think that a subset-query would speed up the process and how should qlikview know which of the conditions would return a TRUE to decide the order in which the queries and sub-queries should be executed - maybe simply from left to the right within the statement but then it would be the knowledge/guessing of the user which order would perform better.
- Marcus
Hi Elena,
Interested in your question from a practical point of view.
Experiment, what code structure in practice it is faster in QlikView from
1) Where YearDate = '2016' And Activity = 1;
2) Consecutive call
Where YearDate = '2016';
Activity = 1;
Create a file QVD.
Table1:
LOAD
RecNo() as ID,
If(Div(RecNo(), 10000)-RecNo()/10000 = 0, 2016, 2015) as YearDate, //create 100 values in 2016, the rest in 2015
If(Div(RecNo(), 10000)-RecNo()/10000 = 0, If(rand()>0.5, 1, 0), 2) as Activity //half of 2016 assign values 0, 1 half, and assign 2 2015
AutoGenerate 50000000;
Store * from Table1 into F:/xyz.qvd;
File with 50 million records containing
Table from QVD of the first full load, and then begin to work with the selection on the basis of 'WHERE'. This allows you to test how the 'WHERE' with the data already loaded in QlikView.
Load the first variant
Table1:
LOAD ID,
YearDate,
Activity
FROM
(qvd);
Let Start = Now();
NoConcatenate
Table2:
LOAD
ID as ID1,
YearDate as YearDate1,
Activity as Activity1
Resident Table1
Where YearDate = '2016' And Activity = 1;
Let Finish = Now();
As can be seen, the load time of the internal table was 17 seconds.
Loading the same data as in the second variant
Table1:
LOAD ID,
YearDate,
Activity
FROM
(qvd);
Let Start = Now();
NoConcatenate
Table3:
LOAD
ID as ID3,
YearDate as YearDate3,
Activity as Activity3
Resident Table1
Where YearDate = '2016';
NoConcatenate
Table2:
LOAD
ID3 as ID2,
YearDate3 as YearDate2,
Activity3 as Activity2
Resident Table3
Where Activity3 = 1;
Let Finish = Now();
As can be seen, the load time of the internal table was 18 seconds. The duration of load increased by 1 second.
Consecutive five downloads are always given difference 1-2 seconds, which is 5-10% load time.
Logically, the real difference is the internal mechanism QlikView should not be more than.
If the load reduction of 5-10% is important, the issue can be studied further. If this does not help, it is necessary to look for another way out.
I would be glad if something helped you.
Regards
Andrey