anni
Partner - Contributor

Incremental Load (New, Changed and Deleted) with a lot of data

Hello

How do you all handle Incremental Load on a big volume of data?
I have followed this guide: https://help.qlik.com/en-US/sense/April2019/Subsystems/Hub/Content/Sense_Hub/LoadData/use-QVD-files-...
There it says that an incremental load handling inserts, updates and deletes should be done like this:

// ThisExecTime is captured at the start of the run (as in the guide);
// LastExecTime holds the value persisted from the previous reload
Let ThisExecTime = Now();

QV_Table:
SQL SELECT PrimaryKey, X, Y FROM DB_TABLE
WHERE ModificationTime >= #$(LastExecTime)#
AND ModificationTime < #$(ThisExecTime)#;

// Append the unchanged historical rows from the qvd
Concatenate LOAD PrimaryKey, X, Y FROM [lib://MyDataFiles/File.QVD] (qvd)
WHERE NOT Exists(PrimaryKey);

// Remove rows whose key no longer exists in the database (the deletes)
Inner Join SQL SELECT PrimaryKey FROM DB_TABLE;

If ScriptErrorCount = 0 then
    STORE QV_Table INTO [lib://MyDataFiles/File.QVD] (qvd);
    Let LastExecTime = ThisExecTime;
End If

The problem is that with several million records the inner join immediately fills up the RAM on the server, which breaks almost everything else running on it.

We've tried another method:

Table:
LOAD
    "primarykey",
    "x",
    "changedate";
SELECT * FROM table WHERE changedate > $(LastLoadDate);

// Keep only the qvd rows whose key was not in the fresh load
Concatenate (Table)
LOAD * FROM table.qvd (qvd)
WHERE NOT Exists("primarykey");

Now this works, but if a data record has changed, the change won't be merged into the existing QVD file, right? I would need to delete from the QVD every record that was also loaded into Table, but I haven't found any way to do that (if you know something, feel free to tell me 🙂).

How do you all handle stuff like that?

Regards

anni

3 Replies
Gysbert_Wassenaar

Unless your database has a table that records which rows were deleted from DB_TABLE, you need to extract all (millions of) primary keys from DB_TABLE and compare them with the primary keys in the QVD.

Instead of an INNER JOIN you could try a RIGHT KEEP against a temporary table holding the primary keys from the database, and drop that temporary table afterwards.
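Something like this (untested; same table and field names as in your example):

QV_Table:
LOAD PrimaryKey, X, Y FROM [lib://MyDataFiles/File.QVD] (qvd);

// Right Keep reduces QV_Table to the rows whose PrimaryKey still
// exists in the database, but unlike Inner Join the two tables are
// never merged; tmpKeys stays a separate table that can be dropped
tmpKeys:
Right Keep (QV_Table)
SQL SELECT PrimaryKey FROM DB_TABLE;

DROP TABLE tmpKeys;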

Perhaps changing the order of operations can help too:

// 1. Load all the primary keys from the database table
tmpKeys:
SQL SELECT PrimaryKey FROM DB_TABLE;

// 2. Load all the data from the qvd with a matching primary key,
//    i.e. skip rows whose primary key has been deleted in the database
QV_Table:
LOAD * FROM [lib://MyDataFiles/File.QVD] (qvd) WHERE Exists(PrimaryKey);

// 3. Store the reduced QV_Table back into the qvd
STORE QV_Table INTO [lib://MyDataFiles/File.QVD] (qvd);

// 4. Clean up: drop QV_Table and tmpKeys
DROP TABLES QV_Table, tmpKeys;

// 5. Load the new and changed records from the database
QV_Table:
SQL SELECT PrimaryKey, X, Y FROM DB_TABLE
WHERE ModificationTime >= #$(LastExecTime)#
AND ModificationTime < #$(ThisExecTime)#;

// 6. Concatenate the remaining data from the qvd
Concatenate LOAD PrimaryKey, X, Y FROM [lib://MyDataFiles/File.QVD] (qvd)
WHERE NOT Exists(PrimaryKey);

// 7. Store the complete table in the qvd
STORE QV_Table INTO [lib://MyDataFiles/File.QVD] (qvd);


talk is cheap, supply exceeds demand
anni
Partner - Contributor
Author

Thank you so much. I will try this as soon as possible!

Just a very basic question about concatenate: if I load a changed dataset, it will be in my QVD and in my newly loaded data. If I concatenate these, will the dataset in the QVD be updated?

I am basically looking for something like T-SQL MERGE. I'm not sure whether concatenate is what I am looking for; I personally find the Qlik KB article on it a bit confusing.

Gysbert_Wassenaar

Concatenate forces records to be appended to an existing table; it's more or less like UNION ALL in SQL. If you concatenate data to a table and then store that table in a QVD, the QVD will contain all the data from that table, including the rows added by the concatenate.
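To get MERGE-like behaviour you combine it with Exists(): load the new and changed records first, then concatenate only the qvd rows whose key was not in that fresh load. A rough sketch (untested, with the field names from the example above):

// New and changed rows, fresh from the database
QV_Table:
SQL SELECT PrimaryKey, X, Y FROM DB_TABLE
WHERE ModificationTime >= #$(LastExecTime)#;

// Append only the old rows whose PrimaryKey was NOT just loaded;
// the old version of every changed row is skipped, so the fresh
// version wins: the net effect of a MERGE (minus the deletes)
Concatenate (QV_Table)
LOAD PrimaryKey, X, Y FROM [lib://MyDataFiles/File.QVD] (qvd)
WHERE NOT Exists(PrimaryKey);

STORE QV_Table INTO [lib://MyDataFiles/File.QVD] (qvd);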


talk is cheap, supply exceeds demand