There are a number of prefixes in QlikView, that help you load and transform data. One of them is the Generic prefix.
Whenever you have a generic database, the Generic prefix can be used to transform the data and create the desired fields. A generic database is basically a table where the second last column is an arbitrary attribute and the very last is the value of the attribute. In the input table below you have a three-column generic database.
But if you want to analyze this data, it is much easier to have all attributes in separate fields so that you can make the appropriate selections. It is not very practical to have one single field for all attribute values, since you may want to make selections using different attributes at the same time.
Enter the Generic prefix.
It converts the data to a structure where each attribute is placed in a field of its own. Another way to express it is to say that it takes field values and converts these to field names. If you compare it to the Crosstable prefix, you will find that they in principle are each other’s inverses.
The syntax is
GenericLoadKey, Attribute, ValueFrom … ;
There are however a couple of things worth noting:
Usually the input data has three columns: one qualifier field (Key in the above example), an Attribute and a Value. But you may also have several qualifying fields. If you have four or more columns, all columns except the two last will be treated as qualifying fields.
The Generic prefix will create several tables; one table per attribute. This is normally not a problem. Rather, it is an advantage: It is the least memory-consuming way to store data if you have many attributes.
If you have more than one key, this means that you will get a composite key – a synthetic key – in the data model:
Although it looks ugly, this synthetic key is completely harmless. But it may still be a good idea to replace it with a manually created concatenated key:
Autonumber(Key1 & '|' & Key2 & '|' & Key3) asKey,
Finally, I have seen many examples on QlikCommunity where a For-Next loop is used to join together all tables created by the Generic prefix, e.g.:
The result is one big table that contains all attributes; a table that often is sparse (containing many NULL values) and much larger than the initial tables. And no performance has been gained… So I can only say:
You should not do this - unless you have a specific reason to.
The Generic prefix creates a set of tables that store the data in an optimal way. In most cases you should not change this. I realize, however, that there are cases where you want to transform the data further and need the data in one, unified table. Then the above scriptlet can be used.
Yes, It is. I miss the part of script where you create a key. In this case LEFT JOIN work. In my scripts I use OUTER JOIN, and I don't create this tables. My script is:
"Although it looks ugly, this synthetic key is completely harmless."
Thx thats nice to know, but how do I do this conclusion, when I dont have you around :-)?
Some more general background knowledge would be nice. I am still a bid confused with Qlik, the general databases rules does not always make sense in Qlik. Sometimes it does, sometimes it does not. Sometimes I could just put it all in one big pot, and other times it is crucial to do at least 3rd normalization etc., split it up in a billion tables.
I kind of miss "specfic Qlik - general database theory" if that makes sense. Is there anything like that?
I think it is fairly straightforward: Synthetic keys are not dangerous and should not be avoided, unless you have several of them and they "interfere" with each other, e.g. if one $Syn key is based on another. In such a case you should re-think your data model.
But if you have one single synthetic key that is "symmetric"; that uses the same constituent fields in all places where it is used, and you feel that the data model is OK, the synthetic key is most likely harmless.
But - you must of course always ask yourself whether it makes sense to use several keys between two tables; whether the data model is OK. Often one of the keys is redundant, and should therefore not be used.
You can use the for-next loop I have in the blog post. The only changes you need to make are
What I call 'GenericLabel', you call 'Projects'. (the prefix in the table names)
The Load statement that loads from 'GenericDB' should be changed to a load statement that loads from project-info.xlsx.
But what I don't understand is why you use autonumber to create the keys. That will result in that the same attribute (e.g. Project Name) all have the same Key value. It seems to me as if you should have an ID for the project as Key...
If the data is like in the picture and the same structure is repeated for each project - i.e. the first line of a project section is a line where 'Project Reference' is found in the first column - then you can create your project ID the following way:
tmpTable:
Load
If(@1='Project Reference',@2,Peek(ProjectID)) as ProjectID,
@1 as ProjectAttribute,
@2 as ProjectValue
From ...
Then, in a second step, you can do the Generic Load: