Are you doing a full reload? For such huge data, a partial load is always preferable (at most 100,000 or 200,000 records per pull); otherwise your application (QVW) will freeze, as used to happen.
Have a QVD generator that refreshes frequently as new records arrive in the ETL/DB, and build your application (QVW) to do a full reload from those QVDs. That will in fact take much less time.
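For illustration, a QVD generator along those lines might look roughly like this (the connection name, table name and file path are placeholders, not your actual setup):

```
// QVD generator: scheduled frequently, pulls from the DB and stores to QVD
ODBC CONNECT TO [MyDataSource];        // placeholder connection

Sales:
SQL SELECT * FROM sales;               // placeholder source table

STORE Sales INTO [sales.qvd] (qvd);
DROP TABLE Sales;
```

The user-facing QVW then just does `LOAD * FROM [sales.qvd] (qvd);`, which QlikView can read as an optimized QVD load.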
Is it indeed possible to do an incremental load for a QVW (application)?
I hope this is not huge data for QlikView, since QV is able to report on big data ...
My question again: do you see, with certain tricks, a possibility to reduce the loading time for processes 1 to 4 above (extraction, loading/transformation, application, publishing), which currently take 25 + 55 + 17 + 5 minutes, down to, say, 20-30 minutes altogether?
One thing: if it is possible, try to build an incremental loading concept. Otherwise you may follow the approach below.
For example, if the extraction QVW takes 45 minutes and has maybe 20 tables to pull and convert into QVDs, try splitting the extraction into three QVWs: small tables in one, medium tables in one, and large tables in one. Then create batch files and trigger the three QVWs at the same time, with a 1-minute difference between them.
You can follow the same method for the transformation process. Maybe it can help.
I like your idea; creating several QVD tables at the same time could indeed be good practice in a loading process.
But again, my wish is to reuse the old QVW definition and just add new records, without reprocessing everything that has been processed before.
Like in my poor drawing
In your case the best option is an INCREMENTAL LOAD ONLY, which avoids unnecessary data refreshes and considers only the DELTA from your DW, meaning:
1. only newly inserted records, and
2. updated records.
For implementing an incremental load you must have CREATED_DATE and UPDATE_DATE fields maintained in your ETL logic.
If you already have these fields in place, just implement it and test. If you require any input for this implementation, just let me know and I will help you with it.
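A minimal sketch of that insert + update pattern, assuming the fact table has a primary key field `ID` and an `UPDATE_DATE` maintained by the ETL, and that the last successful reload time is kept in a variable `vLastExecTime` (all names here are illustrative):

```
// 1. Pull only rows created or updated since the last reload
Sales:
SQL SELECT * FROM sales
WHERE UPDATE_DATE >= '$(vLastExecTime)';

// 2. Append the historic rows from the stored QVD, skipping any key
//    that was just re-pulled (so updated records replace their old version)
Concatenate (Sales)
LOAD * FROM [sales.qvd] (qvd)
WHERE NOT Exists(ID);

// 3. Write the refreshed set back for the next run
STORE Sales INTO [sales.qvd] (qvd);
```

The `WHERE NOT Exists(ID)` clause is what handles the updated records: QlikView keeps the fresh copy already in memory and drops the stale one from the QVD.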
Apart from this, consider the following points while loading data:
1. Remove synthetic keys; use the AUTONUMBER function.
2. Try to have an optimized load.
3. Drop temporary tables in the script.
4. Do calculations at script level; it gives better performance at report rendering time.
Hope this helps.
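For point 1, a sketch of replacing a synthetic key with AUTONUMBER. Here two tables share the fields OrderID and LineNo (which would otherwise create a synthetic key), so both build the same composite surrogate key instead; all field and file names are invented for illustration:

```
Orders:
LOAD AutoNumber(OrderID & '|' & LineNo) AS %OrderKey,
     OrderID,           // keep the raw fields in one table only
     LineNo,
     OrderDate,
     Amount
FROM [orders.qvd] (qvd);

OrderDetails:
LOAD AutoNumber(OrderID & '|' & LineNo) AS %OrderKey,
     Product,
     Quantity
FROM [orderdetails.qvd] (qvd);
```

AutoNumber maps each distinct composite string to a small sequential integer, so the link field is compact and the synthetic key disappears.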
Thank you for your input. We intend to use QlikView instead of / independently of a DW. Our DW for QV is just QVDs.
Let's say we manage to identify only the new records for the QVD. It is still not clear to me how to use the old QVW with the old data and just add/append new lines to it, in order to reduce loading and calculation time.
Some kind of binary load of the old QVW plus adding new (or even modified) lines, something similar to how SAP BW handles the interaction between a DSO and an InfoCube.
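QlikView does have something in that direction: a BINARY load imports the whole data model of an existing QVW, and it must be the very first statement of the script; new rows can then be concatenated on top. A rough sketch, assuming the old application contains a table named Sales and the file names are placeholders:

```
// BINARY must be the first statement in the script
Binary [old_app.qvw];

// Append only today's rows on top of the inherited data model
Concatenate (Sales)
SQL SELECT * FROM sales
WHERE DateModified >= Today();
```

Note that this inherits everything from the old QVW, so the old application has to stay consistent with the new script's table names.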
From your drawing, you are asking how to insert only the new data into the old QVW/QVDs, right?
If that is what you are expecting, you need to implement the incremental load concept.
Example: you have a QVD called sales.qvd containing yesterday's data. Now you don't want to pull and load all the data into the QVD again; for that, add the following steps to your script.
Step 1: Load the QVD.
Step 2: Write the script to pull the new data, i.e. a query such as SELECT * FROM sales WHERE DateModified = Today().
Step 3: Concatenate both tables using Concatenate.
Step 4: Replace sales.qvd with the combined data.
Sales:
LOAD * FROM [sales.qvd] (qvd);

Concatenate (Sales)
SQL SELECT * FROM sales WHERE DateModified = Today();

STORE Sales INTO [sales.qvd] (qvd);
You must ensure that the best practices defined by QlikTech are followed while the QVW is created.
You need to check that there are no loops or synthetic keys, and that the final QlikView schema is a star or snowflake schema.
Also, QlikView is clever enough to store each distinct value of a field only once, to save memory/space/size.
For example, if the number 999 occurs a million times, it is stored only once, and every further occurrence is just a pointer to it, occupying fewer bytes. You might therefore look at options to break some fields apart at the QVD layer, e.g. splitting a date into day, month and year fields, to save space.
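Splitting a timestamp at the QVD layer as suggested might look like this (field and file names are assumed for illustration):

```
// A full timestamp is nearly unique per row, so its symbol table is huge.
// Splitting it yields low-cardinality fields that compress far better.
Sales:
LOAD *,
     Year(OrderTimestamp)         AS OrderYear,
     Month(OrderTimestamp)        AS OrderMonth,
     Date(Floor(OrderTimestamp))  AS OrderDate
FROM [sales.qvd] (qvd);

// Optionally drop the raw timestamp if reports never need it
DROP FIELD OrderTimestamp;
```

Year and month each have only a handful of distinct values, so the pointer-based storage described above pays off immediately.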