Solved: Best Data Model - Thoughts? - Qlik Community

Report Inappropriate Content · ‎2015-09-21

So like a lot of businesses, we have millions of rows, (circa 20). At present, all of the data gets put into one Fact Table, not keyed, has 170 columns and is slow. My question is this - is it better to have one fact table with 170 columns or split it out into a star schema?

tresesco · ‎2015-09-21

Though the NULL itself would not cause a big problem, but the other factors caused it's creation would. More nulls means information density is less and more number of refrences to a single value in terms of memory pointer. Memory pointers would require to point and store the same value numerous times and that becomes a performance killer.

View solution in original post

tresesco · ‎2015-09-21

That depends. If the table is mostly populated (information density is high, i.e - less number of nulls) the single table would be a better choice. If the table is actually derived from multiple tables and has got lots of nulls (because of joins or concatenations), might be you have to look back and think of breaking it down.

Report Inappropriate Content · ‎2015-09-21

Any chance you could expand on why this is please?

Gysbert_Wassenaar · ‎2015-09-21

Perhaps this blog post helps: A Myth about the Number of Hops.

Where do you experience slow performance? Possible the data model isn't the problem but a chart is. Or one or more expressions that are not optimal for their purpose.

talk is cheap, supply exceeds demand

tresesco · ‎2015-09-21

If you go with many-table way, it gives you normalized model which would have advantages of:

Memory Efficient - (single table would consume much memory)
More meaningful when look at the model - you can understand and manage easily
Risk of wrong calculation reduces - that could appear in the process of joins(improper)

But this is not a such straight-forward case of winning over. There would always be a trade-off between CPU efficiency and Memory efficiency, and de-normalized approach(single table) wins by a small margin there.

Report Inappropriate Content · ‎2015-09-21

but more specifically where NULL values are concerned? a normalised model would win over a one table fact model because ...

tresesco · ‎2015-09-21

Though the NULL itself would not cause a big problem, but the other factors caused it's creation would. More nulls means information density is less and more number of refrences to a single value in terms of memory pointer. Memory pointers would require to point and store the same value numerous times and that becomes a performance killer.