Solved: Which is the best Data Model - Qlik Community

Anonymous · ‎2016-03-01

Hi,

Total I have created 3 schemas for my dashboard. Not sure which schema to use.

Pure star schema
Schema with Link table
Star schema with out Center Fact Table

Now I should suggest my organization to use the BEST Schema depending on Performance wise. Please provide me with advantages, disadvantages and the best method use ?

1. PURE STAR SCHEMA:

In this schema Mem Id and Month_Year is the common field in all the tables. DW_MEM_MONTH_XTRA is the fact table and rest of them were dimension tables.

With this pure star schema the size of the QVW application became 89MB. Total 59 fields.

In this schema I have used so many # of distinct values in the expressions.

2. STAR SCHEMA WITH OUT CENTER FACT TABLE:

In this schema Mem Id and Month_Year is the common field in all the tables. DW_MEM_MONTH_XTRA is the fact table and rest of them were dimension tables.

With this star schema the size of the QVW application became 42MB. Total 55 fields.

In this schema I have used so many # of distinct values in the expressions.

3. SCHEMA WITH LINK TABLE:

In this schema Mem Id and Month_Year is the common field in all the tables. DW_MEM_MONTH_XTRA is the fact table and rest of them were dimension tables.

With this star schema the size of the QVW application became 49MB. Total 65 fields. Because of the Link table the # of rows increased=2586767

In this schema I have NOT used # of distinct values in the expressions.

please help.

Thanks,

Rajitha

Anonymous · ‎2016-03-01

Ravi ,

let me put like this and correct if i am wrong as per my understanding the Model number is 2 good because:

1. When we compare model 2 with that of model 1 . The link Key is present in the all the tables we are replicating the same column 5 times in the center table in order to get to the true look of the start schema. Because of this additional columns the size of the app is increasing to 89 MB. So the memory consumption when opening the app would be more .

2. When the model 2 is compared to model 3 , the disadvantage in model 3 is since we are creating a link table in the memory which is an additional table concatenated from all the tables . This additional table is also taking some additional space of 9 MB in the disk which and little more memory consumption when opening up the app. I know which is very minor. But wanted to come to the true comparision.

View solution in original post

ravishankarqv · ‎2016-03-01

Hi,

In general, go for START SCHEMA, which is easy to maintain.

As per your analysis, 3 seems to be good in terms memory and less number of distinct expression (I am not sure its impact on application).

What are your key business constraints?

How large this data can grow further?

Rgds

Ravi

Anonymous · ‎2016-03-01

Ravi ,

let me put like this and correct if i am wrong as per my understanding the Model number is 2 good because:

1. When we compare model 2 with that of model 1 . The link Key is present in the all the tables we are replicating the same column 5 times in the center table in order to get to the true look of the start schema. Because of this additional columns the size of the app is increasing to 89 MB. So the memory consumption when opening the app would be more .

2. When the model 2 is compared to model 3 , the disadvantage in model 3 is since we are creating a link table in the memory which is an additional table concatenated from all the tables . This additional table is also taking some additional space of 9 MB in the disk which and little more memory consumption when opening up the app. I know which is very minor. But wanted to come to the true comparision.

marcus_sommer · ‎2016-03-02

To compare only the app-size isn't enough to decide which approach might be the most suitable - especially by rather small apps like this one. Furthermore you would need to check the records and sizes of tables and fields and the calculation times of your objects with a mem-file analysis: Recipe for a Memory Statistics analysis in conjunction with various optimizing technics: Document Analyzer update 2.7 and The Importance Of Being Distinct.

Another important point are load-times and also the efforts to develop and maintain such a datamodel: Data Modelling: Clarity vs. Speed.

Personal I prefer the star-scheme with a single fact-table and some dimension-tables around and I implement a link-table approach only if it's necessary (by having more then one fact-table which I didn't could merge into one single fatc-table without other disadvantages) and I never use your preferred second approach and if you looked through various material here within the community: More advanced topics of qlik datamodels or within books: Books and literature to the topic of creating datamodels you wouldn't find a recommendation for it (this doesn't meant it won't in general work but it has definite disadvantages).

- Marcus

Which is the best Data Model

Related Topics