Skip to main content
Announcements
Qlik Connect 2025: 3 days of full immersion in data, analytics, and AI. May 13-15 | Orlando, FL: Learn More
cancel
Showing results for 
Search instead for 
Did you mean: 
Anonymous
Not applicable

Which is the best Data Model

sunindiaMarcus_Sommer


Hi,


Total I have created 3 schemas for my dashboard. Not sure which schema to use.

  • Pure star schema
  • Schema with Link table
  • Star schema with out Center Fact Table

Now I should suggest my organization to use the BEST Schema depending on Performance wise. Please provide me with advantages, disadvantages and the best method use ?

1.      PURE STAR SCHEMA:

purestar schema.png

In this schema Mem Id and Month_Year is the common field in all the tables. DW_MEM_MONTH_XTRA  is the fact table and rest of them were dimension tables.

With this pure star schema the size of the QVW application became 89MB. Total 59 fields.

In this schema I have used so many # of distinct values in the expressions.

  1. 2. STAR SCHEMA WITH OUT CENTER FACT TABLE:

star schema without.png

In this schema Mem Id and Month_Year is the common field in all the tables. DW_MEM_MONTH_XTRA  is the fact table and rest of them were dimension tables.

With this star schema the size of the QVW application became 42MB. Total 55 fields.

In this schema I have used so many # of distinct values in the expressions.

  1. 3. SCHEMA WITH LINK TABLE:

linktable.png

In this schema Mem Id and Month_Year is the common field in all the tables. DW_MEM_MONTH_XTRA  is the fact table and rest of them were dimension tables.

With this star schema the size of the QVW application became 49MB. Total 65 fields. Because of the Link table the # of  rows increased=2586767

In this schema I have NOT used # of distinct values in the expressions.

please help.

Thanks,

Rajitha

1 Solution

Accepted Solutions
Anonymous
Not applicable
Author

Ravi ,

let me put like this and correct if i am wrong as per my understanding the Model number is 2  good because:

1. When we compare model 2 with that of model 1 . The link Key is present in the all the tables we are replicating the same column 5 times in the center table in order to get to the true look of the start schema. Because of this additional columns the size of the app is increasing to 89 MB. So the memory consumption when opening the app would be more .

2. When the model 2 is compared to model 3 , the disadvantage in model 3 is since we are creating a link table in the memory which is an additional table concatenated from all the tables . This additional table is also taking some additional space of 9 MB in the disk which and little more memory consumption when opening up the app. I know which is very minor. But wanted to come to the true comparision.

View solution in original post

3 Replies
ravishankarqv
Contributor III
Contributor III

Hi,

In general, go for START SCHEMA, which is easy to maintain.

As per your analysis, 3 seems to be good in terms memory and less number of distinct expression (I am not sure its impact on application).

What are your key business constraints?

How large this data can grow further?

Rgds

Ravi

Anonymous
Not applicable
Author

Ravi ,

let me put like this and correct if i am wrong as per my understanding the Model number is 2  good because:

1. When we compare model 2 with that of model 1 . The link Key is present in the all the tables we are replicating the same column 5 times in the center table in order to get to the true look of the start schema. Because of this additional columns the size of the app is increasing to 89 MB. So the memory consumption when opening the app would be more .

2. When the model 2 is compared to model 3 , the disadvantage in model 3 is since we are creating a link table in the memory which is an additional table concatenated from all the tables . This additional table is also taking some additional space of 9 MB in the disk which and little more memory consumption when opening up the app. I know which is very minor. But wanted to come to the true comparision.

marcus_sommer

To compare only the app-size isn't enough to decide which approach might be the most suitable - especially by rather small apps like this one. Furthermore you would need to check the records and sizes of tables and fields and the calculation times of your objects with a mem-file analysis: Recipe for a Memory Statistics analysis in conjunction with various optimizing technics: Document Analyzer update 2.7 and The Importance Of Being Distinct.

Another important point are load-times and also the efforts to develop and maintain such a datamodel: Data Modelling: Clarity vs. Speed.

Personal I prefer the star-scheme with a single fact-table and some dimension-tables around and I implement a link-table approach only if it's necessary (by having more then one fact-table which I didn't could merge into one single fatc-table without other disadvantages) and I never use your preferred second approach and if you looked through various material here within the community: More advanced topics of qlik datamodels or within books: Books and literature to the topic of creating datamodels you wouldn't find a recommendation for it (this doesn't meant it won't in general work but it has definite disadvantages).

- Marcus