Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi, Im performing an incremental load.
A part of the script requires a test if new records were found.
For this I add the following test:
LOAD 1 as Dummy
Resident $(table)
where RowNo()=0;
LET vNewRecordsExists = If(peek('Dummy')=1,-1,0);
But, too much time is spent on this test, meaning this test has bad performance.
Any suggestion to improve performance of this test will be appreciated.
Thanks in advance.
Ori
Dear Ori,
I use "not exist" function to get the insert's in a new data set..
At the below, you can see the code that i use.. First of all, i define a variable which holds last execution date.
First load statement, load the data from an excel sheet or rdbms system.
At where condition i get the data of the last reload date.
Then i reload my existing data which i store it as a qvd file.
Then , concetanete the first load statement with the second one by using the "not exist" function.
LET vLASTRELOAD= NUM(RIGHT(DATE(RELOADTIME()),2))&LEFT(DATE(RELOADTIME()),2)&MID(DATE(RELOADTIME()),4,2);
Directory;
LOAD ID,
A,
B,
DATE,
UPDATE_DATE
FROM INC.xls (biff, embedded labels, table is Sheet1$) WHERE $(vLASTRELOAD)=UPDATE_DATE;
CONCATENATE
Directory;
LOAD ID,
A,
B,
DATE,
UPDATE_DATE
FROM INC.QVD (qvd)
WHERE NOT EXISTS(ID); //INSERT
I hope, it gives you an idea about the processes.
BR
Omer
Hi Omer
Thanks for the helpful answer.
But What if I could find a fast way to validate new lines existance,
wouldn't it be faster to skip this Concatenation step when no lines are found?
Hi Ori,
You can validate it with different ways but in the end, you need to insert the new lines to your existing dataset.
as far as i know that, reading a dataset from qvd and concetentation is not very time consuming processes.
in my example, where condition will be a time consuming event, but no other way to understand which row is the inserted other than comparing timestamp fields.
You may wish to use some other specific change-data capture (CDC) tools to get changes less than a second.