Do not input private or sensitive data. View Qlik Privacy & Cookie Policy.
Skip to main content

Announcements
Qlik Open Lakehouse is Now Generally Available! Discover the key highlights and partner resources here.
cancel
Showing results for 
Search instead for 
Did you mean: 
Anonymous
Not applicable

Processing of WHERE clause

Hello everybody!


I have a conceptual question considering the WHERE-clause in load script. How does Qlik process this statement?

Simple example:

Load *

Resident Table1

Where 1 = 1

          and Year(Date) = 2016

          and Activity = 1

;

How does Qlik process this statement?

I see 2 possible approaches.


1. a) take Table1

    b) take all the rows in Table1 with year = 2016 (let us call it subset1)

    c) take all the rows in subset1 with Activity = 1


2. a) take Table1

    b) check if the row has year = 2016

    c) if step b is true, check if the row has activity = 1

    d) process next row


I think right understanding of this process can be very useful in performance increasing.


Have a nice day!

Elena


11 Replies
marcus_sommer

I don't think that a subset-query would speed up the process and how should qlikview know which of the conditions would return a TRUE to decide the order in which the queries and sub-queries should be executed - maybe simply from left to the right within the statement but then it would be the knowledge/guessing of the user which order would perform better.

- Marcus

ahaahaaha
Partner - Master
Partner - Master

Hi Elena,

Interested in your question from a practical point of view.

Experiment, what code structure in practice it is faster in QlikView from

1) Where YearDate = '2016' And Activity = 1;

2) Consecutive call

    Where YearDate = '2016';

    Activity = 1;

Create a file QVD.

Table1:

LOAD

RecNo() as ID,

If(Div(RecNo(), 10000)-RecNo()/10000 = 0, 2016, 2015) as YearDate, //create 100 values in 2016, the rest in 2015

If(Div(RecNo(), 10000)-RecNo()/10000 = 0, If(rand()>0.5, 1, 0), 2) as Activity //half of 2016 assign values 0, 1 half, and assign 2 2015

AutoGenerate 50000000;

Store * from Table1 into F:/xyz.qvd;

File with 50 million records containing

1.jpg

Table from QVD of the first full load, and then begin to work with the selection on the basis of 'WHERE'. This allows you to test how the 'WHERE' with the data already loaded in QlikView.

Load the first variant

Table1:

LOAD ID,

     YearDate,

     Activity

FROM

(qvd);

Let Start = Now();

NoConcatenate

Table2:

LOAD

     ID as ID1,

     YearDate as YearDate1,

     Activity as Activity1

Resident Table1

Where YearDate = '2016' And Activity = 1;

Let Finish = Now();

2.jpg

As can be seen, the load time of the internal table was 17 seconds.

Loading the same data as in the second variant

Table1:

LOAD ID,

     YearDate,

     Activity

FROM

(qvd);

Let Start = Now();

NoConcatenate

Table3:

LOAD

     ID as ID3,

     YearDate as YearDate3,

     Activity as Activity3

Resident Table1

Where YearDate = '2016';

NoConcatenate

Table2:

LOAD

     ID3 as ID2,

     YearDate3 as YearDate2,

     Activity3 as Activity2

Resident Table3

Where  Activity3 = 1;

Let Finish = Now();

3.jpg

As can be seen, the load time of the internal table was 18 seconds. The duration of load increased by 1 second.


Consecutive five downloads are always given difference 1-2 seconds, which is 5-10% load time.


Logically, the real difference is the internal mechanism QlikView should not be more than.

If the load reduction of 5-10% is important, the issue can be studied further. If this does not help, it is necessary to look for another way out.


I would be glad if something helped you.


Regards

Andrey