Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 
Not applicable

Performance enhancement using QVD incremental load

The advantage of QVD is obviously the speed at which data can be loaded into the QVW. In the script I'm building data is loaded overnight and only data elements which are new need to be loaded from the SQL database in the script and added to the QVD. I thought this would lead to a considerable performance enhancement however I still need to check wether the field info is new or old. Right now I'm using an IF statement based on the value of a Timestamp. Do you have any idea how to make a construction to speed up this process as checking all elements and combining the new ones to the QVD doesn't seem faster then loading all elements from the SQL database. Thanks!

1 Solution

Accepted Solutions
Not applicable
Author

Already got the answer on a different post. The trick is to put the Where clause inside the SQL Select. So:

SQL SELECT *

FROM DPAR.dbo.BookInventory

WHERE BookInventoryID < 13000000;

View solution in original post

5 Replies
danielrozental
Master II
Master II

Post your script so we can look into improving it.

Not applicable
Author

I think using WHERE NOT exists is a lot faster than if statements or distinct.  For example if you have the following fields:

customer_number,

contract_id,

customer_number & '|' & contract_id as 'key',

start_date,

end_date,

timestamp

You could use something like:

WHERE NOT exists(key,customer_number & '|' & contract_id)

The point being that if whatever "key" you use is already loaded, don't load it, otherwise load it.  Here's an example I used before:

Table1:

customer_number,

status,

status_update_date,

customer_number & '|' & status & '|' & status_update_date as 'key',

other_stuff,

other_stuff2

FROM wherever

WHERE NOT exists(key, customer_number & '|' & status & '|' & status_update_date);

I had a file for each day and I only wanted to load rows where the combination of customer number, status, and status update date had never been loaded.  When I used LOAD DISTINCT, it took over 5 hours but when I used WHERE NOT eixsts( it took like an hour and a half.  Hopefully that helps.

Not applicable
Author

Hi there, if your only criteria for loading new information is timestamp, you may try a similar code to the following:

let vToday = today() - 1;

NewRows:

Load A, B, C, DateTimeField;

select A, B, C DateTimeField

from Table

where DateTimefield >= $(vToday);

OldRows:

Load *

from respository.qvd (qvd)

where not exists(DateTimeField);

Notice that several things are happening here:

In the first load we are loading yesterday's and today's information, so only fresh data is being pulled from the database. Notice that you can easily store and load your last successful reload date if the today() -1 approach does not work for you. It is also necessary that you know the exact way in which your date time field in the database is formatted, so you can format the today´s date string propperly in QLikView. For example:

Load *;

sql select *

from Table

where  DateField >= date '2011-07-27';

In the second load, where are loading the historical data storage in qvd, but only does rows that had not been loaded by the first load statement.

Regards

Not applicable
Author

First of all thanks for your help!

@Daniel: I'm still figuring out how i'm gonna script it so I don't have more then small pieces

@Trent&Ivan: I understand the performance improvement for certain cases but it still requires going trough all records. It doesn't actually limit the number of records you load to determine wether a record is new/old/existing

Right now I'm looking into the SQL select to limit the amount of data which is actually pulled into the script. In the normal configuration I would have something like below where a record is loaded depending on a certain treshold.

LOAD BookInventoryID

    WHERE BookInventoryID < 50000;

SQL SELECT BookInventoryID

FROM DPAR.dbo.BookInventory

What I'm looking for is a limitation of the data set in the SQL select. If I order my records like shown below. I want to go trough the records until the BookInventoryID reaches the treshold and then terminate the load. However I don't no if it is possible to incorporate the while statement (It's not working this way ) Any thoughts?

LOAD BookInventoryID;

SQL SELECT BookInventoryID

WHILE BookInventoryID < 50000

FROM DPAR.dbo.BookInventory

ORDER BY BookInventoryID ASC;

Not applicable
Author

Already got the answer on a different post. The trick is to put the Where clause inside the SQL Select. So:

SQL SELECT *

FROM DPAR.dbo.BookInventory

WHERE BookInventoryID < 13000000;