Smallest performance footprint between duplicated ... - Qlik Community

MattiasThalén · ‎2023-05-04

Hi!

I have two solutions for a model and am trying to decide on which would have the smallest performance footprint:

[Order Fact (10M): %ItemKey] > [Item Dim (1M): %ProductGroupKey] > [Product Group Dim (100k)]
[Order Fact (10M): %ItemKey] > [Item Dim (1M): %ProductGroupKey]
+ [Order Fact (10M): %ItemKey] > [Product Group Dim (1M)]

Basically it stands between having a snowflake or a star schema, an extra relation, vs duplicating data in [Product Group Dim] so I can join it directly to [Order Fact].

My gut tells my that having that option 1 has the smallest footprint on performance.

Chanty4u · ‎2023-05-04

You are correct that having option 1 with a snowflake schema is likely to have a smaller performance footprint than option 2 with a star schema.

In a snowflake schema, the fact table is connected to dimension tables through intermediate tables, resulting in a more normalized data structure. This can help to reduce data redundancy and improve data consistency, which can lead to faster query performance.

Smallest performance footprint between duplicated rows vs extra relation

General Question