Re: Why is a Concatenated table bad for the RAM co... - Qlik Community

jblomqvist · ‎2015-08-18

Hi all,

I was reading QlikView for Developers book and it said in there that a Star Schema is the best type of schema to aim for (something I have read before elsewhere and here).

But I am curious, why is a Concatenated table bad for the RAM consumption compared to Star Schema?

robert_mika · ‎2015-08-18

Because you have all data in one table (RAM) so is not much work done from CPU.

When you join(Star Schema) you use the CPU (processor) power.

This is of course depends of how much data you have so you have to find solution that based suit your data model.

daniel_kusiak · ‎2015-08-18

Hi Robert Mika, maybe it's a good topic for your next guide 🙂

jblomqvist · ‎2015-08-18

Hi Robert,

Could you please breakdown what you mean by CPU used more when a star schema is applied?

I understand the bit about one table having all data means it is all loaded on the RAM.

Just want to know more about what you mean about the CPU utilising the CPU more in a star schema

Oleg_Troyansky · ‎2015-08-18

You need to clarify your terminology - what do you call a "concatenated table" and how is it different, in your opinion, from a Star Schema.

The terminology that I use is the following:

Star Schema is a kind of an analytical data structure (as opposed to a transactional one) that satisfied the following Requirements:

1. There is only one Fact table that contains all the transactions in the data model.

2. There is a number of Dimensional tables surrounding the Fact, and each Dimensional table is located not further than 1 link away from the Fact (in other words, all Dimensional tables are associated to the Fact but not to other Dimensional tables).

3. All the fields that participated in Expressions (metrics, flags, etc...) are stored in the Fact table.

Concatenated table is one of the commonly used data modelling techniques that is used to combine multiple transactional tables into a single ("concatenated") Fact table. The result of this technique is usually a Star Schema, or a Snowflake if Dimensional tables happen to be associated between them.

Star Schema is typically the best performing option of all, for two main reasons - the metrics are all stored in a single table, and Dimensions are not further than one link away. That being said, a Snowflake schema may only be 10-15% slower than a Star. This difference can be negligible in smaller data sets.

Now, having clarified the terminology, would you like to clarify your question?

You can read more about QlikView data modeling in my new book "QlikView Your Business" that just become available for purchase.

cheers,

Oleg Troyansky

QlikView Your Business: An expert guide to Business Discovery with QlikView and Qlik Sense

Oleg Troyansky
Call me for expert support, performance tuning, advanced training and more!

robert_mika · ‎2015-08-18

From what I have been learned star Schema or any type of join use CPU power.

Now if you have system with lots of RAM that does not affect the performance(well depends on data load).

but if you do not have enough RAM is better to use processor for calculations so joins are better than concatenated tables.

robert_mika · ‎2015-08-18

Thanks Daniel for mentioning.

At the moment I will probably will utilized the 400 "missing" functions but I will keep that in mind when I run out of topics

🙂

Why is a Concatenated table bad for the RAM consumption compared to Star Schema?