Qlik Community

Qlik Design Blog

All about product and Qlik solutions: scripting, data modeling, visual design, extensions, best practices, etc.

Employee
Employee

Overview of Qlik Incremental Loading

be9734868170ef50ac66fcd2b59da561.jpg.pngHey Guys - I have been meaning to do a video on this topic for many years and I finally got around to it! Over the years as I taught myself Qlik, I would dabble with the idea of incremental loading - but never really had a grasp on it as my data sets were always used for demos or they were quite small.....until I started building my Qlik Sense Retro-Gaming Dashboard.



10-23-2018 9-53-56 AM.png


(I use the incremental loading best practice to insert new video game pricing data daily - which allows me to view the pricing history of my collection for "Loose" and "Complete in Box" values). As I was building my app I had a ton of resources available to me but I never really found a decent video on this topic that simply explained what it was or how it is used. I figured I would create something short and sweet with a simplistic example from what I have learned, with hope it helps you understand incremental loading a bit better than I did! There are a few prerequisites you should be aware of, such as understanding what .QVD files are and what there purpose is. As well as basic data model design that include key and date fields which are used to identify new and changed records. I've included some samples files and resources at the end of this post to help.


Let me know what you think and please leave your questions or comments below. Enjoy!


Incremental Load


Regards,

Michael Tarallo (@mtarallo) | Twitter

Qlik

Resources:

Sample Insert Only - Incremental Load Script

//Simple Insert - New Records from Source Data

// Get the last OrderID loaded into the .QVD

MaxKeyLoad:

LOAD Max(OrderID) as MaxID

FROM [lib://IncLoadLocation (desktop-aura45o_mto)/sales_data.qvd] (qvd);

//Set a variable to be used in the SQL

Let MaxID = peek('MaxID',0,MaxKeyLoad);

//Get the latest data from the source where the OrderID in the Source is greater than the orderID in the .QVD

[orders]:

Load *;

SQL SELECT * FROM "Sales"."dbo"."Orders$"

WHERE OrderID > '$(MaxID)';

// Data is automatically concatenated if the data model has the sames fields

// CONCATENATE forces concatenation (in this example it is not necessary)

// Load all data from the .QVD

CONCATENATE

LOAD * from [lib://IncLoadLocation (desktop-aura45o_mto)/sales_data.qvd] (qvd);

//Store the new larger qvd file

// This will store the existing data and the new record and overwrite the existing .qvd

STORE orders into [lib://IncLoadLocation (desktop-aura45o_mto)/sales_data.qvd] (qvd);

// Exits script - used for debugging and troubleshooting - will not execute the next tab.

exit script;

Can't see the video? YouTube blocked in your region or office? - Download the .mp4 to watch on your computer or mobile device.

In case you want to perform this test with MS SQL - I added the SQL Queries and the MS SQL Database .bak file as well.

I also updated this to include a sample script - mikes_script.txt - this shows a simple insert using files as sources and a dropbox connection as the main connector for the inventory file.

12 Comments
marmentrout5
Contributor

Good overview of the process.

1,995 Views

the process in overall is quite straight forward. What i think should be said is that getting last OrderId in Qlik the way it is shown in example may not be the fastest solution and will slow down whole process if you have milions of records you need to load,

I would suggest using solution with:  

          MaxKeyLoad:

          Load

               Max(temp_orderid) as MaxID 

          ;

          Load

               FieldValue('OrderID', IterNo()) as temp_orderid

          Autogenerate

               (1)

          While

               not IsNull(FieldValue('OrderID', IterNo()))

          ;


More about this here MIN AND MAX IN LARGE TABLES


regards

Lech

1,995 Views
erhancopur
New Contributor II

Hello

I did as you stated in your article.

My qvd file loaded 71mb of data to my c drive. Now I can see my graph but when i use the update button extension nearly 200mb of data is received from network and takes like 40 seconds to reload. Isn't it supposed to query for only the updated rows at sql? I think it queries all the data from the begginig. My load script is below. What i did wrong?

MaxIDyukle:

Load Max(ID) as MaxID

From [lib://Qlik_QVD_DATA/Hourly.qvd] (qvd);


Let MaxID = Peek('MaxID',0,MaxIDyukle);


tablo:

Load ID,

    ****,

    DateTime,

    Referans ;

SQL SELECT ID,

    *****,

    DateTime,

    Referans

from "*****.dbo."URT_BASE"

WHERE ID > '$(MaxID)';

Concatenate

Load * from 'lib://Qlik_QVD_DATA/Hourly.qvd' (qvd);

Store tablo into 'lib://Qlik_QVD_DATA/Hourly.qvd' (qvd);

[Tablo$]:


SQL SELECT ID,

    *****,

    DateTime,

    Referans

FROM *****.dbo."URT_BASE";


STORE Tablo$ into [lib://Qlik_QVD_DATA/Hourly.qvd] (qvd);


exit script;

0 Likes
1,995 Views
Employee
Employee

Hi Lech lech_miszkiewicz can you take a look at what Erhan has stated?

0 Likes
1,995 Views

Hi Michael,

There are few unknown in Erhans script:

  • some of the fields in above script are replaced with ***
  • tablo & [Tablo$] tables will likely autoconcatenate and they both are stored afterwards as Hourly.qvd - Why?
    • from the script above it looks like table is loaded twice
      • 1st time when is using this for incremental (as table called tablo - this is where in my opinion process should end)
      • 2nd time when is fully loaded again (as [Tablo$] table - question is why or what for?)

To me this script does not look right as Erhans is loading data from 2 sources and is overwriting the same table.

cheers

Lech

0 Likes
1,995 Views
erhancopur
New Contributor II

Hello and thanks for helping

I wrote this script from the articles i found from web and it is the first time i did this, it most probably has mistakes. My aim here is to load only the updated rows from the sql database to the qvd file, while doing this i don't want qlik sense to query all the data in the sql database every time i click reload button but just the updated ones(reason is to take less time while updating the data and not too much loading on sql).

In order to do this as i understand from what i read First we need to get the latest ID from the Qvd file and assigning it to a variable. then compare this to the actual latest ID column in sql database and update the ones after this.

I use the reload extension from qlik branch.

i just didn't wanted to write the database and column name so changed them to **.

Hope I am clear.  I can try other scripts you will suggest. Because now everytime i update, lots of network traffic will happen.

0 Likes
1,995 Views

to cut story short,

tablo:

Load ID,

    ****,

    DateTime,

    Referans ;

SQL SELECT ID,

    *****,

    DateTime,

    Referans

from "*****.dbo."URT_BASE"

WHERE ID > '$(MaxID)';


Concatenate

Load * from 'lib://Qlik_QVD_DATA/Hourly.qvd' (qvd);

Store tablo into 'lib://Qlik_QVD_DATA/Hourly.qvd' (qvd);

Exit Script;

// ---------------------------------------------------------------

This is where i would exit script; as rest of it is not required!!!

you don't need below part - effectively you are duplicating records by loading everything once again from SQL!

[Tablo$]:

SQL SELECT ID,

    *****,

    DateTime,

    Referans

FROM *****.dbo."URT_BASE";


STORE Tablo$ into [lib://Qlik_QVD_DATA/Hourly.qvd] (qvd);


exit script;

0 Likes
1,995 Views

one more comment about the size of your file:

Since you were loading the same data twice it was likely auto-concatenating so your QVD file became twice as big as originally (or maybe even more than twice)

regards

Lech

0 Likes
1,995 Views
georgiohb
New Contributor

Thanks for the info, but how can this be achieved for loading New CSVs or Excel files  ( NO SQL Statements ) ?

0 Likes
1,995 Views
Employee
Employee

Hi Georgio - I do this with my Retrogaming Dashboard:

Here is an example - please note this can be done a few ways, I am still considered a novice when it comes to incremental load - but this has been running for me over a year now - it just covers the insert scenario - inserting new records daily.

Mike's Qlik Sense Retrogaming Collection

Please look at the blog attachments - I added an example script (mikes_script.txt) of the one that I use for my incremental load.

Hope this helps

Let me know how you do.

Regards,

Mike T

Qlik

0 Likes
1,995 Views

My initial job making questionnaire. How this functionality works?

Let MaxID = peek('MaxID',0,MaxKeyLoad); 


[orders]: 

Load *; 

SQL SELECT * FROM "Sales"."dbo"."Orders$" 

WHERE OrderID > '$(MaxID)';


We are calling Variable into SQL script. How Qlik functionality used for SQL? Not sure, Is that really works?

0 Likes
1,995 Views

Hi Anil

this is very basic... i will break it into individual segments

  • Let MaxID creates MaxID variable called MaxID
  • peek('MaxID',0,MaxKeyLoad) is a Qlik function which should return:
    • Peek function searches MaxID field in row 0 in Table MaxKeyLoad and allocates found VALUE  to your MaxID variable (for explanation purpose lets assume that MaxId = 10045
  • MaxID variable can be then passed as a value into SQL statement.

Above example assumes that OrderID is an increment and each new order get higher/bigger value

We are not aclling variable into SQL script - we are passing variable value into SQL where clause, which basicly means that instead $(MaxID) there is a VALUE passed through... so what gets passed to SQL is actually:

Load *; 

SQL SELECT * FROM "Sales"."dbo"."Orders$" 

WHERE OrderID > '10045';

will i pass your test Anil? 🙂

cheers

Lech

0 Likes
1,995 Views