Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 
ThePewSpecialist
Partner - Contributor
Partner - Contributor

When does one pick a snowflake model over joining redundant data in the data load editor?

At what point would one consider joining redundant data onto a dimension table over a snowflake model?

Let's say we have a sales table, this sales table has a product table with product ID, description, product status and warehouse ID.

Product ID Description Product Status Warehouse ID
0001 Bottle of water Available w1

 

This bottle of water also has a warehouse connected to it.

Warehouse ID Warehouse Location Warehouse Shelf Warehouse Category
w1 France 111 Consumables

 

At what point would you no longer join warehouse to product table as one warehouse could contain hundreds of products and in turn create a lot of duplicate data. As I understand Qlik will have to calculate this join whenever you're using it on the front. However does this mean we always pre-join such things or is there a point where we'd not do this?

Should we instead add the Product ID to the warehouse table and connect it directly to the sales table?

Labels (1)
1 Solution

Accepted Solutions
marcus_sommer

That data are redundant is usually not a problem within Qlik because only distinct values are stored within the system-table (for each field one) and connected with a bit-stuffed index to the data-tables - at least from storing/performance point of view. How suitable the relation-ship between tables fits to the view requirements is a different matter and couldn't be said in general. It always depends ...

Beside this it's often helpful to specialize the applications. This could mean to create a sales-analysis with just a few stock-information and creating a stock-analysis which contains the sales as one part of the product-movements but not trying to combine complex analysis from various areas.

View solution in original post

1 Reply
marcus_sommer

That data are redundant is usually not a problem within Qlik because only distinct values are stored within the system-table (for each field one) and connected with a bit-stuffed index to the data-tables - at least from storing/performance point of view. How suitable the relation-ship between tables fits to the view requirements is a different matter and couldn't be said in general. It always depends ...

Beside this it's often helpful to specialize the applications. This could mean to create a sales-analysis with just a few stock-information and creating a stock-analysis which contains the sales as one part of the product-movements but not trying to combine complex analysis from various areas.