Hello, I have a question regarding optimization of my Qlik Sense data model. Let's say that I have the following tables:
Table A

| Test | Measure 1 |
|------|-----------|
| A    | 1         |
| A    | 2         |
| A    | 3         |

Table B

| Test | Measure 1 | Measure 2 |
|------|-----------|-----------|
| B    | 4         | 50        |
| B    | 5         | 60        |
| B    | 6         | 70        |

Table C

| Test | Measure 3 |
|------|-----------|
| C    | 10        |
| C    | 15        |
| C    | 45        |
All of these tables have the same primary key (Test) but will not always have the same measures. I have another table linking these together by the Test key.
Test Dimension Table

| Test |
|------|
| A    |
| B    |
| C    |
In my app, I am creating visualizations that will use measures from each of these Tests, but often not at the same time. I am dealing with a large volume of data, so I am trying to optimize performance and reduce chart calculation time. I have a few questions regarding how Qlik evaluates expressions:
Single Table

| Test | Measure 1 | Measure 2 | Measure 3 |
|------|-----------|-----------|-----------|
| A    | 1         |           |           |
| A    | 2         |           |           |
| A    | 3         |           |           |
| B    | 4         | 50        |           |
| B    | 5         | 60        |           |
| B    | 6         | 70        |           |
| C    |           |           | 10        |
| C    |           |           | 15        |
| C    |           |           | 45        |
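For reference, a single table like the one above can be built in the load script by concatenating the three source tables. This is a sketch only; the library paths and QVD names are placeholders, and in Qlik any fields missing from a source are filled with NULL automatically:

```qlik
// Build one fact table by concatenating Tables A, B, and C.
// Fields absent from a given source come through as NULL.
Facts:
LOAD Test, [Measure 1]
FROM [lib://Data/TableA.qvd] (qvd);

CONCATENATE (Facts)
LOAD Test, [Measure 1], [Measure 2]
FROM [lib://Data/TableB.qvd] (qvd);

CONCATENATE (Facts)
LOAD Test, [Measure 3]
FROM [lib://Data/TableC.qvd] (qvd);
```

Since the field sets differ between the sources, the explicit `CONCATENATE` keyword is needed; Qlik only auto-concatenates tables whose field lists match exactly.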
Performance depends on the number of hops between tables that the engine needs to make to calculate an expression.
So in your example, if an aggregation combines fields spread across more than two tables, the engine has to hop between those tables.
The more hops, the more expensive the operation becomes.
That is why a star schema is preferable to a snowflake schema.
The ideal scenario for Qlik is a single table, but that is less flexible and uses more memory.
Performance optimization is generally a quest to find the best-performing option with the maximum possible flexibility.
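To make the "hops" point concrete, here is a hedged sketch with hypothetical table and field names (Sales, Products, Quantity, UnitPrice are not from your model). An expression like Sum(Quantity * UnitPrice) is cheaper when both fields sit in the same table, so a common optimization is to join the lookup table into the fact table during the load:

```qlik
// Snowflake layout: UnitPrice lives in a separate Products table,
// so Sum(Quantity * UnitPrice) must hop between Sales and Products.
Sales:
LOAD OrderID, ProductID, Quantity
FROM [lib://Data/Sales.qvd] (qvd);

// Denormalize: pull UnitPrice into the fact table so the
// aggregation can be evaluated without any table hops.
LEFT JOIN (Sales)
LOAD ProductID, UnitPrice
FROM [lib://Data/Products.qvd] (qvd);
```

The trade-off is exactly the one described above: fewer hops and faster chart calculation, at the cost of a wider table and more memory.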
Yes, there are more and many more measures than in the example above (about 5-6 unique measures per table). Would you be able to elaborate on why you think that would give the best performance? I'm really trying to find out how it would make a difference within chart expressions.
Gotcha. So just to confirm: even if, in the above example, Table A has 50 million rows and Table C has 500,000 rows, would it make no difference in performance to calculate Avg([Measure 3]) against a single table containing 50+ million rows rather than against a separate table containing 500,000 rows?
Concatenating all three into one table yields faster performance,
so you are good with what you have done.
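One detail worth noting about the concatenated table: Qlik aggregation functions ignore NULLs, so Avg([Measure 3]) only aggregates the rows where Measure 3 actually has a value; the rows contributed by Tests A and B carry NULL in that field and are skipped. A chart expression against the single table would simply be:

```qlik
// Chart expression on the concatenated table; NULL values of
// Measure 3 (the Test A and B rows) are ignored by Avg().
Avg([Measure 3])

// Equivalent here, with set analysis making the scope explicit
// (optional, since the other rows are NULL for this field anyway):
Avg({<Test = {'C'}>} [Measure 3])
```

This is a sketch of standard Qlik behavior, not a claim about exact scan cost on your specific data volumes.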