Hello
How do you all handle incremental loads on large volumes of data?
I have followed this guide: https://help.qlik.com/en-US/sense/April2019/Subsystems/Hub/Content/Sense_Hub/LoadData/use-QVD-files-...
It says that an incremental load handling inserts, updates and deletes should be done like this:
QV_Table:
SQL SELECT PrimaryKey, X, Y FROM DB_TABLE
WHERE ModificationTime >= #$(LastExecTime)#
AND ModificationTime < #$(ThisExecTime)#;

Concatenate LOAD PrimaryKey, X, Y FROM [lib://MyDataFiles/File.QVD]
WHERE NOT EXISTS(PrimaryKey);

Inner Join SQL SELECT PrimaryKey FROM DB_TABLE;

If ScriptErrorCount = 0 then
    STORE QV_Table INTO [lib://MyDataFiles/File.QVD];
    Let LastExecTime = ThisExecTime;
End If
The problem is that with several million records, the inner join immediately fills up the server's RAM and thus breaks almost everything else running on the server.
We've tried another method:
Table:
LOAD
    primarykey,
    x,
    changedate;
SELECT * FROM table WHERE changedate > $(LastLoadDate);

Concatenate (Table)
LOAD * FROM table.qvd (qvd)
WHERE NOT Exists(primarykey);
Now this works, but if any data records have changed, the changes won't be merged into the existing qvd file, right? I would need to delete every record that was loaded into Table from the qvd file, but I haven't found a way to do that (if you know something, feel free to tell me 🙂).
How do you all handle stuff like that?
Regards
anni
Unless your database has a table that records which rows were deleted from DB_TABLE, you need to extract all (millions of) primary keys from DB_TABLE and compare them with the primary keys in the qvd.
Instead of an INNER JOIN you could try a RIGHT KEEP with a temporary table containing the primary keys from the database, and drop that temporary table afterwards.
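As a rough sketch of that idea (untested, field and connection names taken from the script above; the intent is that Right Keep reduces QV_Table to the keys that still exist in the database without ever joining the key list into the big table):

// Load the existing data from the qvd
QV_Table:
LOAD PrimaryKey, X, Y FROM [lib://MyDataFiles/File.QVD] (qvd);

// Reduce QV_Table to rows whose PrimaryKey still exists in the database;
// the tables stay separate, so no wide join result is built in RAM
tmpKeys:
Right Keep (QV_Table)
SQL SELECT PrimaryKey FROM DB_TABLE;

// The key list is no longer needed
DROP TABLE tmpKeys;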
Perhaps changing the order of doing things can help too:
// 1. Load all the primary keys from the database table
tmpKeys: SQL SELECT PrimaryKey FROM DB_TABLE;
// 2. Load all the data from the qvd with a matching primary key, i.e. don't load data with primary keys that have been deleted in the database
QV_Table: LOAD * FROM [lib://MyDataFiles/File.QVD] WHERE Exists(PrimaryKey);
// 3. Store QV_Table into the qvd
STORE QV_Table INTO [lib://MyDataFiles/File.QVD];
// 4. Clean up; Drop QV_Table and tmpKeys
DROP TABLES QV_Table, tmpKeys;
// 5. Load the new records from the database
QV_Table:
SQL SELECT PrimaryKey, X, Y FROM DB_TABLE
WHERE ModificationTime >= #$(LastExecTime)#
AND ModificationTime < #$(ThisExecTime)#;
// 6. Concatenate the data from the qvd
Concatenate LOAD PrimaryKey, X, Y FROM [lib://MyDataFiles/File.QVD]
WHERE NOT EXISTS(PrimaryKey);
// 7. Store the complete table in the qvd
STORE QV_Table INTO [lib://MyDataFiles/File.QVD];
Thank you so much. I will try this as soon as possible!
Just a very basic question about Concatenate: if I load a changed dataset, it will be in my qvd and in my newly loaded data. If I concatenate these, will the dataset in the qvd be updated?
I am basically looking for something like T-SQL MERGE. I'm not sure whether Concatenate is what I'm looking for; I personally find the Qlik knowledge base article on it a bit confusing.
Concatenate forces records to be appended to an existing table. It's more or less like UNION ALL in SQL. If you concatenate data to a table and then store that table in a qvd, the qvd will contain all the data from that table, including the rows added with Concatenate.
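A tiny illustration with made-up inline tables of what Concatenate does and does not do:

// Existing data, with key 2 about to change
Old:
LOAD * INLINE [
Key, Value
1, old
2, old
];

// Appends rows like UNION ALL; key 2 now appears twice
Concatenate (Old)
LOAD * INLINE [
Key, Value
2, new
3, new
];

The result has four rows (keys 1, 2, 2, 3): Concatenate never updates or de-duplicates anything. To get MERGE-like behaviour you load the new/changed rows first and then concatenate the old rows from the qvd with WHERE NOT Exists(Key), so the outdated version of key 2 is filtered out before it can be appended.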