Do not input private or sensitive data. View Qlik Privacy & Cookie Policy.
Skip to main content

Announcements
Qlik Open Lakehouse is Now Generally Available! Discover the key highlights and partner resources here.
cancel
Showing results for 
Search instead for 
Did you mean: 
MattiasThalén
Partner - Contributor II
Partner - Contributor II

Smallest performance footprint between duplicated rows vs extra relation

Hi!

I have two solutions for a model and am trying to decide on which would have the smallest performance footprint:

  1. [Order Fact (10M): %ItemKey] > [Item Dim (1M): %ProductGroupKey] > [Product Group Dim (100k)]
  2. [Order Fact (10M): %ItemKey] > [Item Dim (1M): %ProductGroupKey]
    + [Order Fact (10M): %ItemKey] > [Product Group Dim (1M)]

Basically it stands between having a snowflake or a star schema, an extra relation, vs duplicating data in [Product Group Dim] so I can join it directly to [Order Fact].

My gut tells my that having that option 1 has the smallest footprint on performance.

Labels (1)
1 Reply
Chanty4u
MVP
MVP

You are correct that having option 1 with a snowflake schema is likely to have a smaller performance footprint than option 2 with a star schema.

 

In a snowflake schema, the fact table is connected to dimension tables through intermediate tables, resulting in a more normalized data structure. This can help to reduce data redundancy and improve data consistency, which can lead to faster query performance.