dselgo_eidex
Partner - Creator III

Data Model Optimization

Hello, I have a question regarding optimization of my Qlik Sense data model. Let's say that I have the following tables:

Table A

Test | Measure 1
A    | 1
A    | 2
A    | 3

Table B

Test | Measure 1 | Measure 2
B    | 4         | 50
B    | 5         | 60
B    | 6         | 70

Table C

Test | Measure 3
C    | 10
C    | 15
C    | 45

All of these tables have the same primary key (Test) but will not always have the same measures. I have another table linking these together by the Test key.

Test Dimension Table

Test
A
B
C


In my app, I am creating visualizations that will use measures from each of these Tests, but often not at the same time. I am dealing with a large volume of data, so I am trying to optimize performance and reduce chart calculation time. I have a few questions regarding how Qlik evaluates expressions:

  1. Would it be optimal to leave each of these as separate tables in order to reduce the number of rows being aggregated over in the chart expressions? Or does it not matter, since they are all linked by the Test Dimension table anyway?
  2. Would it be optimal to concatenate these three tables into a single table (see the script sketch after the Single Table example below)? It would reduce complexity, but I want to know whether it would also decrease performance.

Single Table

Test | Measure 1 | Measure 2 | Measure 3
A    | 1         |           |
A    | 2         |           |
A    | 3         |           |
B    | 4         | 50        |
B    | 5         | 60        |
B    | 6         | 70        |
C    |           |           | 10
C    |           |           | 15
C    |           |           | 45

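For concreteness, here is roughly what the concatenation in option 2 would look like in the load script (a minimal sketch using inline data that mirrors the example tables; the real loads would come from the actual sources):

// Option 2 sketch: build one fact table by explicit concatenation.
// Inline data mirrors the example tables; real source loads would replace these.
Facts:
LOAD * INLINE [
Test, Measure 1
A, 1
A, 2
A, 3
];

Concatenate (Facts)
LOAD * INLINE [
Test, Measure 1, Measure 2
B, 4, 50
B, 5, 60
B, 6, 70
];

Concatenate (Facts)
LOAD * INLINE [
Test, Measure 3
C, 10
C, 15
C, 45
];

The explicit Concatenate prefix is needed here because the field sets differ; Qlik only auto-concatenates tables whose field names match exactly. Fields missing from a source end up as null in the combined table, as shown in the Single Table above.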

6 Replies
dplr-rn
Partner - Master III

I am assuming the number of tables is much larger than in the example you provided.
I would say a single table would give the best performance.
dselgo_eidex
Partner - Creator III
Author

Yes, there are more tables, and many more measures than in the example above (about 5-6 unique measures per table). Would you be able to elaborate on why you think that would give the best performance? I'm really trying to find out how it would make a difference within chart expressions.

dplr-rn
Partner - Master III

Performance depends on the number of hops between tables that are needed to calculate something.

So in your example, if a measure involves fields from more than two tables, the aggregation will have to hop between tables.

The more hops, the more expensive the operation becomes.

Hence why a star schema is preferable to a snowflake schema.

The ideal scenario for Qlik is a single table, but that's less flexible and takes more memory.

Performance optimization is generally a quest to find the best-performing option with the maximum possible flexibility.
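As a rough illustration with the field names from the example: take a chart that uses Test (from the Test Dimension table) as its dimension with this measure:

// Chart measure; the chart dimension is Test from the Test Dimension table.
Avg([Measure 3])

// Multi-table model: Measure 3 lives in Table C, so the engine must
// traverse Test Dimension -> Table C to associate the values (one hop).
// Single concatenated table: Test and Measure 3 sit on the same rows,
// so the aggregation resolves within one table, with no hop.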

dselgo_eidex
Partner - Creator III
Author

Gotcha. So just to confirm: even if, in the above example, Table A has 50 million rows and Table C has 500,000 rows, it wouldn't make any difference in performance to calculate avg([Measure 3]) against a single table containing 50+ million rows versus a separate table containing 500,000 rows?

dplr-rn
Partner - Master III

I don't believe so, though I'm not 100% sure.

From a modeling point of view, with the info I have, I think a single table is the best approach.
krishna_2644
Specialist III

Concatenating all three into one table yields faster performance.

So you are good with what you have done.