I would like to know how you handle and manage large QVWs (>1 GB) that need to be updated more than once a day.
We have a three-tiered setup:
1) QVD 1:1 (3 GB) - joins historic and new data into one table - 25 minutes
2) QVD with certain joins, logic etc. (4 GB) - not an optimized QVD load, because of calculations - 55 minutes
3) QVW loads everything from the 2nd QVD layer into the quality layer (1 GB) - joins several tables, many of them as optimized QVD loads - 17 minutes
4) QVW has to be published via Publisher to the AccessPoint (1 GB) - 5 minutes
If just one new record comes from the data source, then the first and second layers, both with 10 million records, have to be refreshed.
Finally, the QVW has to be re-processed from scratch. Afterwards the QVW has to be saved to disk and then copied into another folder as the productive version.
Is there any other method to do this faster and more efficiently? - without dropping any fields from the QVD tables ...
Thank you very much in advance for any inputs.
Are you doing a full reload? For such huge data, a partial/incremental load is always preferable - at most 100,000 or 200,000 records per run - otherwise your application (QVW) will freeze, as used to happen.
Have a QVD generator refresh frequently as new records arrive in the ETL/DB, and create your application (QVW) to do a full reload from the QVDs. This will in fact take less time.
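The "QVD generator + full reload" pattern can be sketched in load script. Here is a minimal insert-only sketch, assuming the source table has a CREATED_DATE column; all table, field and file names are illustrative, not from this thread:

```qlikview
// QVD generator: fetch only rows created since the last successful run,
// then append the historic rows from the existing QVD.
// ReloadTime() returns when the previous reload finished - good enough
// as a cutoff for a sketch.
LET vLastExec = Timestamp(ReloadTime(), 'YYYY-MM-DD hh:mm:ss');

Orders:
SQL SELECT * FROM ORDERS
WHERE CREATED_DATE >= '$(vLastExec)';

// Optimized QVD load: no transformations, so QlikView streams the file
Concatenate (Orders)
LOAD * FROM Orders.qvd (qvd);

STORE Orders INTO Orders.qvd (qvd);
DROP TABLE Orders;
```

The front-end QVW then simply does `LOAD * FROM Orders.qvd (qvd);`, which is fast because it is an optimized load.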
Is it indeed possible to do an incremental load for a QVW (application)?
I would hope this is not huge data for QlikView, since QV is supposed to be able to report on big data ...
My question again: do you see, with certain tricks, a possibility to reduce the loading time for processes 1 to 4 (above), which currently take 25 + 55 + 17 + 5 minutes, to, say, 20-30 minutes altogether - across extraction, loading, transformation, application and publishing?
1. Do not load any field that is not used in your reports.
2. Remove all synthetic keys.
3. Try to map your fields to numeric keys instead of strings.
4. Recheck your joins; where LEFT or RIGHT joins can be used, prefer them over outer joins/concatenations.
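Point 3 is usually done with AutoNumber, which replaces a (possibly long) string key with a compact sequential integer. A sketch with assumed table and field names; note that the same AutoNumber counter must be used on both sides of the key, and that the function call prevents an optimized QVD load for these tables:

```qlikview
// Same counter ID ('Cust') on both tables so the keys still match
Facts:
LOAD AutoNumber(CustomerKey, 'Cust') AS CustomerID,
     Amount
FROM Facts.qvd (qvd);

Customers:
LOAD AutoNumber(CustomerKey, 'Cust') AS CustomerID,
     CustomerName
FROM Customers.qvd (qvd);
```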
In your case the best option is an INCREMENTAL LOAD, which avoids unnecessary data refreshes and considers only the delta, i.e.
1. only newly inserted records, and
2. updated records
from your DW.
To implement an incremental load you must have CREATED_DATE and UPDATE_DATE fields in your ETL logic.
If you already have these fields in place, just implement and test it. If you need any input for the implementation, let me know and I will help you with it.
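Assuming CREATED_DATE/UPDATE_DATE exist, the insert + update variant is commonly written like this; table, field and variable names are illustrative:

```qlikview
// 1) Pull the delta: everything inserted or updated since the last run
Delta:
SQL SELECT * FROM DW_TABLE
WHERE UPDATE_DATE >= '$(vLastReload)';

// 2) Add the old, unchanged rows from the QVD. A WHERE NOT Exists()
//    on a single field keeps this load optimized and also drops the
//    stale versions of updated rows.
Concatenate (Delta)
LOAD * FROM DW_TABLE.qvd (qvd)
WHERE NOT Exists(PRIMARY_KEY);

// 3) Write the refreshed history back
STORE Delta INTO DW_TABLE.qvd (qvd);
```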
Apart from this, consider the following points while loading data:
1. Remove synthetic keys; use the AutoNumber function.
2. Try to use optimized QVD loads.
3. Drop temporary tables in the script.
4. Do calculations at script level; this gives better performance at report rendering time.
Hope this helps.
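On point 2: a QVD load is only "optimized" when QlikView can stream the file as-is; almost any expression in the LOAD drops it back to a much slower record-by-record load. A small illustration with assumed names:

```qlikview
// Optimized: plain fields only (a single WHERE Exists() is also allowed)
Sales:
LOAD * FROM Sales.qvd (qvd);
DROP TABLE Sales;  // point 3: drop temporary tables when done

// NOT optimized: the calculation forces row-by-row processing
SalesGross:
LOAD SalesID,
     Amount * 1.2 AS GrossAmount
FROM Sales.qvd (qvd);
```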
Thank you for your input. We intend to use QlikView instead of / independently of a DW. Our DW for QV is just QVDs.
Let's say we manage to identify only the new records for the QVD. It is still not clear to me how to reuse the old QVW with its old data and just add/append the new lines to it, in order to reduce loading and calculation time.
Some kind of binary load of the old QVW plus adding the new (or even modified) lines - similar to how SAP BW handles the interaction between DSO and InfoCube.
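QlikView's closest equivalent is a BINARY load: as the very first statement of a script it pulls in the entire data model of an existing QVW, after which the delta can be concatenated on top. A sketch with assumed names; note that this works well for appending new rows, but handling updated rows is harder, since their old versions are already inside the inherited model:

```qlikview
// Must be the very first statement in the script
BINARY ProductionApp.qvw;

// Append only the new records on top of the inherited data model
Concatenate (Facts)
SQL SELECT * FROM SOURCE_TABLE
WHERE CREATED_DATE >= '$(vLastReload)';
```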
One option, if possible, is to build an incremental loading concept. Otherwise you may try the following.
For example, if the extraction-process QVW takes 45 minutes and has maybe 20 tables to pull and convert into QVDs, try splitting it into three QVWs: one for the small tables, one for the medium-sized tables and one for the large tables. Then create batch files and trigger the three QVWs at the same time, with a 1-minute offset between them.
You can follow the same method for the transformation process. Maybe it can help.
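The "trigger three QVWs at the same time" step can be scripted with QlikView Desktop's command line, whose /r switch reloads and saves a document. A sketch as a Windows batch file; all paths and file names are assumptions:

```
@echo off
rem "start" launches each reload without waiting for the previous one to finish
start "" "C:\Program Files\QlikView\qv.exe" /r "D:\Extract\SmallTables.qvw"
timeout /t 60
start "" "C:\Program Files\QlikView\qv.exe" /r "D:\Extract\MediumTables.qvw"
timeout /t 60
start "" "C:\Program Files\QlikView\qv.exe" /r "D:\Extract\LargeTables.qvw"
```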
I like your idea; this could indeed be good practice in a loading process, creating several QVD tables at the same time.
But again, my wish is to reuse the old QVW contents and just add the new records, without reprocessing everything that has been processed before.
Like in my poor drawing