I think using WHERE NOT exists is a lot faster than if statements or distinct. For example if you have the following fields:
customer_number & '|' & contract_id as 'key',
You could use something like:
WHERE NOT exists(key,customer_number & '|' & contract_id)
The point being that if whatever "key" you use is already loaded, don't load it, otherwise load it. Here's an example I used before:
customer_number & '|' & status & '|' & status_update_date as 'key',
WHERE NOT exists(key, customer_number & '|' & status & '|' & status_update_date);
I had a file for each day and I only wanted to load rows where the combination of customer number, status, and status update date had never been loaded. When I used LOAD DISTINCT, it took over 5 hours but when I used WHERE NOT eixsts( it took like an hour and a half. Hopefully that helps.
Hi there, if your only criteria for loading new information is timestamp, you may try a similar code to the following:
let vToday = today() - 1;
Load A, B, C, DateTimeField;
select A, B, C DateTimeField
where DateTimefield >= $(vToday);
from respository.qvd (qvd)
where not exists(DateTimeField);
Notice that several things are happening here:
In the first load we are loading yesterday's and today's information, so only fresh data is being pulled from the database. Notice that you can easily store and load your last successful reload date if the today() -1 approach does not work for you. It is also necessary that you know the exact way in which your date time field in the database is formatted, so you can format the today´s date string propperly in QLikView. For example:
sql select *
where DateField >= date '2011-07-27';
In the second load, where are loading the historical data storage in qvd, but only does rows that had not been loaded by the first load statement.
First of all thanks for your help!
@Daniel: I'm still figuring out how i'm gonna script it so I don't have more then small pieces
@Trent&Ivan: I understand the performance improvement for certain cases but it still requires going trough all records. It doesn't actually limit the number of records you load to determine wether a record is new/old/existing
Right now I'm looking into the SQL select to limit the amount of data which is actually pulled into the script. In the normal configuration I would have something like below where a record is loaded depending on a certain treshold.
WHERE BookInventoryID < 50000;
SQL SELECT BookInventoryID
What I'm looking for is a limitation of the data set in the SQL select. If I order my records like shown below. I want to go trough the records until the BookInventoryID reaches the treshold and then terminate the load. However I don't no if it is possible to incorporate the while statement (It's not working this way ) Any thoughts?
SQL SELECT BookInventoryID
WHILE BookInventoryID < 50000
ORDER BY BookInventoryID ASC;