tknagaraj
Contributor

Loading reference data with huge volume in model

I have a performance issue: there is a large reference-data table that needs to be loaded into the app. The data is stored in a QVD, but loading around 60 million rows takes about 15 minutes.

Are there any good design principles to apply here?

My model is based on a star schema.

Any pointers will help. 

 

6 Replies
marcus_sommer

Make sure that the QVD data is loaded optimized. This means no transformations are applied to the data; the only exception is a single where exists(Key) clause.
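An optimized load of that kind could be sketched like this (the field and file names are illustrative assumptions, not from the original post):

```qlik
// Optimized QVD load: a plain field list with no expressions,
// renaming, or calculations. The only filter that keeps the load
// optimized is a single WHERE Exists() on one field.
Reference:
LOAD
    Key,
    Attribute1,
    Attribute2
FROM [lib://Data/Reference.qvd] (qvd)
WHERE Exists(Key);  // Key values must already be in the data model, e.g. from the fact table
```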

Or
MVP

If loading a QVD takes an excessive amount of time:

* Reduce the size of the QVD by eliminating unnecessary columns, storing data efficiently, etc.

and/or

* Change the load to ensure it is optimized
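A one-off slimming step along those lines could look like the following sketch (names are assumptions): load only the fields the app actually uses, store them back to a smaller QVD, and keep the production load optimized against that file.

```qlik
// One-off slimming step: keep only the needed fields,
// then store a smaller QVD for subsequent optimized loads.
SlimReference:
LOAD
    Key,
    Attribute1
FROM [lib://Data/Reference.qvd] (qvd);

STORE SlimReference INTO [lib://Data/Reference_slim.qvd] (qvd);
DROP TABLE SlimReference;
```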

 

 

tknagaraj
Contributor
Author

Thanks for the reply. There are no transformations; it is just a direct load of all columns. Let me try the Exists clause to limit the rows. But I also have a situation that needs a left outer join, as shown below.

Table A:  Transaction data

Table B: Reference data linked to Table A. Only matched rows are needed.

Table C: Reference data linked to Table B. Need the rows that match Table B, to pick up its attributes.

This is a left-outer-join situation. But is a left outer join a good solution in terms of Qlik performance?

Note that Table C is 60 million rows, Table A is 10 million, and Table B is 3 million. The expected final output is 10 million rows.
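The chain described above can be sketched as follows (table, key, and field names are assumptions): load A first, restrict B to keys present in A via Exists(), then restrict C to keys present in the filtered B before joining its attributes on.

```qlik
// Fact table first - its keys drive the Exists() filters below.
TableA:
LOAD KeyAB, TransactionFields
FROM [lib://Data/TableA.qvd] (qvd);

// Optimized load: only the rows of B whose KeyAB occurs in A.
TableB:
LOAD KeyAB, KeyBC, B_Attributes
FROM [lib://Data/TableB.qvd] (qvd)
WHERE Exists(KeyAB);

// Only the rows of C whose KeyBC occurs in the filtered B,
// then attach C's attributes to B via a left join.
LEFT JOIN (TableB)
LOAD KeyBC, C_Attributes
FROM [lib://Data/TableC.qvd] (qvd)
WHERE Exists(KeyBC);
```

Filtering C with Exists(KeyBC) before the join means only the roughly 3 million matching rows of the 60 million ever enter memory.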

tknagaraj
Contributor
Contributor
Author

The left join seems to do the trick; it loads the data very fast. Not sure if it will have any impact in the future, but I will stick with it for now.

marcus_sommer

Joins are (quite heavy) transformations. They can in general be replaced with mappings, which often perform faster while being more powerful and flexible.

Furthermore, using exists() to filter the data directly when loading a source, before applying any join/mapping approach, will also save resources and therefore run time.
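A sketch of the mapping alternative (names are assumptions): a mapping table holds one attribute keyed by KeyBC, and ApplyMap() attaches it while Table B is loaded. Note that a mapping returns a single field, so wide attribute sets still need several maps or a join.

```qlik
// Mapping table: exactly two columns, the key and the value to return.
MapC:
MAPPING LOAD KeyBC, C_Attribute
FROM [lib://Data/TableC.qvd] (qvd);

// Attach the attribute during the load of B - no separate join pass.
// Applying a function makes this load non-optimized, but it avoids
// holding two large tables in memory at once.
TableB:
LOAD
    KeyAB,
    KeyBC,
    ApplyMap('MapC', KeyBC, Null()) AS C_Attribute
FROM [lib://Data/TableB.qvd] (qvd);
```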

tknagaraj
Contributor
Contributor
Author

Thanks. Will try Exists.