Qlik Community

Ask a Question

Qlik Design Blog

All about product and Qlik solutions: scripting, data modeling, visual design, extensions, best practices, etc.

Announcements
Join us March 10th, 7 Ways Modern Analytics Can Help You Take Smarter Action. REGISTER NOW
Henric_Cronström

In a well visited post on the community forum, John Witherspoon some time ago asked “Should We Stop Worrying and Love the Synthetic Key?” John’s post begins: “Synthetic keys have a bad reputation. The consensus seems to be that they cause performance and memory problems, and should usually or even always be removed. I believe that the consensus is wrong.” Here’s my view on this topic.

The creation of synthetic keys is simply QlikView’s way of managing composite keys. There is nothing strange or magic around it.

A single key is easy to manage: Just list all unique values in a symbol table (see Symbol Tables and Bit-Stuffed Pointers), and then link the data tables using a natural join.

Source table view.png

But a composite key is slightly different – there is no single symbol table that contains the relevant combinations of the multiple key fields. So QlikView needs to create such a table for all combinations: the $Syn table. In a way, you can say that the $Syn table is a symbol table for composite keys. In the data tables, the multiple keys are replaced by an identifier that uniquely identifies the combination of the values of the original keys: the $Syn key.

Hence, if you have the same set of multiple keys in two or more tables, the QlikView synthetic keys create a general, correct, compact and efficient solution. Synthetic keys do not per se cause performance and memory problems. They do not use a lot more memory than if you autonumber your own concatenated key. And they treat NULLs correctly, as opposed to an explicit concatenated key.

Hence: The synthetic key is in itself good and we should all love it.

Internal table view.png

However…  still, also I avoid synthetic keys. Why?

  1. A synthetic key is in my experience often a sign of a poorly designed data model. I say that, given the number of times I have found a synthetic key in the table viewer only to realize that I made a mistake in the script. If you get a synthetic key and didn’t expect it, I can only say: Back to the drawing board! You should most likely change your data model.
  2. QlikView creates an additional table (the $Syn table) that in many cases is superfluous: An additional table is the best solution if none of the data tables by itself completely spans the set of composite keys. But in real life, there is usually one table that contains all relevant combinations of the keys, and then this table can be used to store the clear text of the individual keys.
  3. For clarity, I like to create my own concatenated keys. It forces me to think and create a data model that I believe in. Removing the synthetic keys becomes a method to ensure a good data model, rather than a goal in itself.

But in principle, I totally agree with John’s initial conclusion: Any problem around synthetic keys is really a data modeling problem and not a problem with the synthetic key itself.

The short answer to John’s question is Yes and No. Yes, we should love the synthetic key. But, No, we should not stop worrying. We should always be alert and ask ourselves: “Do I want this synthetic key? Is the data model OK?

And so, because of the automated and irrevocable data-modeling process which rules out human meddling, the Synthetic Keys are scaring. But they are simple to understand. And completely credible and convincing.

Dr HIC

Further reading on Qlik data modelling:

Circular References

Fan traps and Chasm traps

35 Comments
Not applicable

Hi,

I have always trained my colleagues not to care about it unless it produces strange results.

Normally Qliktech was teaching to solve the synthetic keys at the beginning of the data-modelling process. However since I'm using the library of Rob Wunderlich Qlikview Components to help me with synthetic keys, I can solve the synthetic keys at the end of the data-modelling process.

Instrumental in this is the function Qvc.LinkTable which has the ability to solve the synthetic keys at the end, instead of the beginning of the data-modelling process.

I hope with this post it will bring it to a bigger audience.

Loops however are a bigger concern of me. I hope you can elaborate more on that in a future blog post.

Kind Regards,


Dion

8,201 Views
rbecher
MVP & Luminary
MVP & Luminary

Hi Henric,

I think a concatenated key will consume more memory because of it's higher cardinality. One recommended best practice is to split fields to save memory (eg. timestamps). This would lead into a syn key if it's a key field. How to decide?

Another question: Is the calculation effort of a link table comparable to a synkey table?

- Ralf

0 Likes
8,201 Views
Gabriel
Partner
Partner

HI HIC,

Personally this is very rich knowledge. Thanks for sharing your knowledge as always.

One thing that sprouts to mind is, could there be hidden agenda for Synthetic Key (table) that is not documented or a useful way of using synthetic key (table) and QlikTech has yet to make known?

Best Regards,

Gabriel


8,201 Views
Henric_Cronström

Ralf

I don't think that an autonumbered concatenated key will consume more memory: They should have identical cardinality. So I would not split fields that are keys.

A $Syn table is like a link table with only one key. The inference engine would treat these two cases identically. But a link table usually has several keys and with that usually many more records, and this will affect the calculation effort.

Gabriel

No, there is nothing hidden with the synthetic key. It is merely a necessity for the QlikView inference engine in order to treat composite keys correctly.

However, one aspect, which isn't obvious (but it is not hidden), is NULL value handling: Should records of different tables be linked if one of several linking keys have a NULL value and the other keys have matching values? My answer is: "No. NULL should never link." And this is also how the synthetic keys work. A manually created, concatenated key will usually link the records, though.

HIC

8,201 Views
Not applicable

Sir,

     You enlightened me about $Syn tables. Thank You.

0 Likes
8,201 Views
Not applicable

Very well explained. I agreed, someone who needs a Syn Key indeed has a bad written data model, always we can find a better solution.

Thanks Henric.

0 Likes
8,201 Views