Qlik Community

Qlik Design Blog

All about product and Qlik solutions: scripting, data modeling, visual design, extensions, best practices, etc.

The Generic Load

There are a number of prefixes in QlikView, that help you load and transform data. One of them is the Generic prefix.

Whenever you have a generic database, the Generic prefix can be used to transform the data and create the desired fields. A generic database is basically a table where the second last column is an arbitrary attribute and the very last is the value of the attribute. In the input table below you have a three-column generic database.

 

Generic transformation4.png

 

But if you want to analyze this data, it is much easier to have all attributes in separate fields so that you can make the appropriate selections. It is not very practical to have one single field for all attribute values, since you may want to make selections using different attributes at the same time.

 

Enter the Generic prefix.

 

It converts the data to a structure where each attribute is placed in a field of its own. Another way to express it is to say that it takes field values and converts these to field names. If you compare it to the Crosstable prefix, you will find that they in principle are each other’s inverses.

 

The syntax is

 

   Generic Load Key, Attribute, Value From … ;

 

There are however a couple of things worth noting:

  • Usually the input data has three columns: one qualifier field (Key in the above example), an Attribute and a Value. But you may also have several qualifying fields. If you have four or more columns, all columns except the two last will be treated as qualifying fields.
  • The Generic prefix will create several tables; one table per attribute. This is normally not a problem. Rather, it is an advantage: It is the least memory-consuming way to store data if you have many attributes.

 

If you have more than one key, this means that you will get a composite key – a synthetic key – in the data model:

 

Multiple keys3.png

 

Although it looks ugly, this synthetic key is completely harmless. But it may still be a good idea to replace it with a manually created concatenated key:

 

   Autonumber(Key1 & '|' & Key2 & '|' & Key3) as Key,

 

Finally, I have seen many examples on QlikCommunity where a For-Next loop is used to join together all tables created by the Generic prefix, e.g.:

 

   Set vListOfTables = ;

   For vTableNo = 0 to NoOfTables()

      Let vTableName = TableName($(vTableNo)) ;

      If Subfield(vTableName,'.',1)='GenericLabel' Then

         Let vListOfTables = vListOfTables & If(Len(vListOfTables)>0,',') & Chr(39) & vTableName & Chr(39) ;

      End If

   Next vTableNo

 

   CombinedGenericTable:

   Load distinct Key From GenericDB;

 

   For each vTableName in $(vListOfTables)

      Left Join (CombinedGenericTable) Load * Resident [$(vTableName)];

      Drop Table [$(vTableName)];

   Next vTableName

 

The result is one big table that contains all attributes; a table that often is sparse (containing many NULL values) and much larger than the initial tables. And no performance has been gained… So I can only say:

 

You should not do this - unless you have a specific reason to.

 

The Generic prefix creates a set of tables that store the data in an optimal way. In most cases you should not change this. I realize, however, that there are cases where you want to transform the data further and need the data in one, unified table. Then the above scriptlet can be used.

 

HIC

40 Comments
Partner
Partner

hi HIC,

very nice

Best Regards,

Kalyan.D

0 Likes
14,294 Views
Not applicable

Hey Henrich,

Thanks, this is a very good overview of a feature that I at first didn't see any use for. But now I can see that I underestimated this. I have a similar case with assets and attributes in part of our inventory. Until now I converted the attributes I used in the analysis in a more cumbersome way (not that there were many).

However, I seem to have over 1200 distinct attributes. With in total about 1.3 M values. Is there a limit to how many tables can be created in QlikView? The table view won't look that good :-).

Off course, if I would make use of this, I would first filter out the attributes that I'm not interested in (then translate the attribute names to something useful as label) and then do this generic load.

0 Likes
14,294 Views

That is indeed a lot of data... but QlikView should be able to handle it anyway.

One possibility is to make a hybrid solution: Load the 3-column generic database as it is (no transformation) and in addition make a generic load with an appropriate filter that loads e.g. the 10 most important attributes. The tables will link on the key field and you can make searches both in the transformed and the raw data.

HIC

0 Likes
14,294 Views
Not applicable

Thanks, that's what I thought. The always annoying thing is that I'll need to deal with translations also. I use extra columns for this so my generic table would look more like:

- ID asset

- Attribute

- Attribute (nl)

- Attribute (fr)

- Value

- Value (nl)

- Value (fr)

I'll have to write a script that joins the translations into these new tables.

0 Likes
14,294 Views

A comment on the script in the post: It makes a Left Join in each loop - something that could be very memory consuming if you have many attributes. An alternative is to use a Group by combined with Subfield. The For-Next loop creates variables that contain field lists used in the Load. It should perform better if you have many attributes, but I haven't verified this. But go ahead and try it:

AttributeList:

Load distinct Attribute From GenericDB;

Set vFieldList1 = ;

Set vFieldList2 = ;

For vAttributeNo = 0 to NoOfRows('AttributeList') - 1

   Let vAttribute = Peek('Attribute',vAttributeNo,'AttributeList') ;

   Let vField = 'If(Attribute=' & chr(39) & vAttribute & chr(39) & ', Value)';

   Let vField1 = 'Concat(distinct ' & vField & ','';'') as ' & vAttribute;

   Let vFieldList1 = vFieldList1 & ',' & vField1 ;

   Let vField2 = 'If(Len(' & vAttribute & ')>0,Trim(Subfield(' & vAttribute & ','';''))) as ' & vAttribute;

   Let vFieldList2 = vFieldList2 & ',' & vField2 ;

Next vAttributeNo

GenericDB:

Load Key

  $(vFieldList2);

Load Key

  $(vFieldList1)

  From GenericDB Group By Key;

14,294 Views
egnaldo_sousa
Contributor II

Thanks

0 Likes
14,294 Views
joaquinlr
Valued Contributor II

Fantastic, HIC.

0 Likes
14,294 Views
MVP & Luminary
MVP & Luminary

I find a slightly easier form of the "ReJoin" scriptlet to be iterating the tablelist backwards because it requires only one FOR-NEXT loop:

CombinedGenericTable:

Load distinct Key From GenericDB;

  

FOR i = NoOfTables()-1 to 0 STEP -1

  LET vTable=TableName($(i));

  IF WildMatch('$(vTable)', 'GenericLabel.*') THEN

  LEFT JOIN ([CombinedGenericTable]) LOAD * RESIDENT [$(vTable)];

  DROP TABLE [$(vTable)];

  ENDIF

NEXT i

Or if you are using QVC:

CALL Qvc.JoinGenericTables('CombinedGenericTable', 'GenericLabel');

14,294 Views
Not applicable

We've found a number of uses for this Generic load - mainly around Question and Answer pairs for Customer Surveys. It make the analysis of each of the customers responses much easier, and allows for a very speedy solution.

14,294 Views
Partner
Partner

Hi

     This is verry good post for this powerful function. But my experian with this function say, that in the script to make 'combine new table' we must use OUTER JOIN, not LEFT JOIN. Because in source table some of attributes can be without value.

Best regards

Dimitar

0 Likes
14,294 Views
MVP & Luminary
MVP & Luminary

Dimitar,

We start by populating the final table with all Keys. So it appears to me that there would be no difference in this case between a LEFT or OUTER JOIN.

-Rob

14,294 Views

I agree completely with Rob Wunderlich here. Further, I'd like to add that you should always start such a multiple join with a

   Load distinct Key From ....

to make sure that you have all keys from the very beginning. It makes the repeated Join faster and more stable.

HIC

0 Likes
14,294 Views
Partner
Partner

Hi Rob,

   Yes, It is. I miss the part of script where you create a key. In this case LEFT JOIN work. In my scripts I use OUTER JOIN, and I don't create this tables. My script is:

// Join table after generic load

SUB JoinTable(wTblName,TblName)

  let i=0;

  let tblToDrop = null();

  for tbl=0 to NoOfTables()-1

  let tB = TableName($(tbl));

  if wildmatch('$(tB)','$(wTblName)*')>0 then

  let i=i+1;

  if i=1 then

  let renTable = '[$(tB)]' ;

  else

  let tblToDrop = tblToDrop & chr(39) & '[$(tB)]' & chr(39) & ',' ;

  end if

  end if

  NEXT tbl

  let tblToDrop = left(tblToDrop,len(tblToDrop)-1);

  RENAME Table $(renTable) to [$(TblName)];

  let strKey = Null() ;

  for i = 1 to NoOfFields('$(TblName)')-1

  let strKey = strKey & FieldName(i,'$(TblName)') & ',';

  next

  for each tbl in $(tblToDrop)

  let Rows = null();

  let fldDataValue = null();

  let tt = PurgeChar('$(tbl)','[]') ;

  // === Check is this field empty

  let Rows = NoOfRows('$(tt)') ;

  let fldData = TextBetween(TextBetween('$(tbl)','[',']'),'.','');

  wCnt:

  load NumericCount(Num#([$(fldData)],'0')) as tx resident $(tbl) group by 1;

  let fldDataTEXT = Peek('tx',0,'wCnt');

  drop table wCnt;

  let strLoad = strKey ;

  if '$(Rows)'='0' then

  drop table $(tbl);

  Else

  If '$(fldDataTEXT)'='0' then

  let strLoad = strLoad & 'text([' & fldData & ']) as [' & fldData & ']' ;

  Outer Join ([$(TblName)])

  load $(strLoad) Resident $(tbl);

  drop table $(tbl);

  Else

  if '$(Rows)'='$(fldDataTEXT)' then

  if WildMatch(lower(fldData),'*код*','*code*','*номер*','*склад*','*шофьор*','*saletype*','*група*')>0 then

  let strLoad = strLoad & 'text([' & fldData & ']) as [' & fldData & ']' ;

  ElseIf WildMatch(lower(fldData),'*дата*','*date*')>0 then

  let strLoad = strLoad & 'alt( date#([' & fldData & '],' & chr(39) & 'YYYY/MM/DD' & chr(39) & '),date#([' & fldData & '],' & chr(39) & 'MM/DD/YYYY' & chr(39) & '),date#([' & fldData & '],' & chr(39) & 'MM/DD/YY' & chr(39) & '),date#([' & fldData & '],' & chr(39) & 'YYYY.MM.DD' & chr(39) & '),date#([' & fldData & '],' & chr(39) & 'MM.DD.YYYY' & chr(39) & '),date#([' & fldData & '],' & chr(39) & 'MM.DD.YY' & chr(39) & '),date#([' & fldData & '],' & chr(39) & 'YYYYMMDD' & chr(39) & '),date#([' & fldData & '],' & chr(39) & 'MMDDYYYY' & chr(39) & '),date#([' & fldData & '],' & chr(39) & 'MMDDYY' & chr(39) & ')) as [' & fldData & ']';

  Else

  let strLoad = strLoad & 'num(num#([' & fldData & '],' & chr(39) & '0' & Chr(39) & ')) as [' & fldData & ']' ;

  End If

  else

  If WildMatch(lower(fldData),'*дата*','*date*')>0 then

  let strLoad = strLoad & 'alt( date#([' & fldData & '],' & chr(39) & 'YYYY/MM/DD' & chr(39) & '),date#([' & fldData & '],' & chr(39) & 'MM/DD/YYYY' & chr(39) & '),date#([' & fldData & '],' & chr(39) & 'MM/DD/YY' & chr(39) & '),date#([' & fldData & '],' & chr(39) & 'YYYY.MM.DD' & chr(39) & '),date#([' & fldData & '],' & chr(39) & 'MM.DD.YYYY' & chr(39) & '),date#([' & fldData & '],' & chr(39) & 'MM.DD.YY' & chr(39) & '),date#([' & fldData & '],' & chr(39) & 'YYYYMMDD' & chr(39) & '),date#([' & fldData & '],' & chr(39) & 'MMDDYYYY' & chr(39) & '),date#([' & fldData & '],' & chr(39) & 'MMDDYY' & chr(39) & ')) as [' & fldData & ']';

  Else

  let strLoad = strLoad & 'text([' & fldData & ']) as [' & fldData & ']' ;

  End If

  End if

  Outer Join ([$(TblName)])

  load $(strLoad) Resident $(tbl);

  drop table $(tbl);

  End If

  end if

  next tbl

END sub

In this script I also check the data type in fields and make the format.

0 Likes
14,294 Views
bgarland
Contributor III

"If you compare it to the Crosstable prefix, you will find that they in principle are each other’s inverses."

That sentence alone was the key for me in finally understanding what a generic load is. Thanks!

0 Likes
14,294 Views
Not applicable

"Although it looks ugly, this synthetic key is completely harmless."

Thx thats nice to know, but how do I do this conclusion, when I dont have you around :-)?

Some more general background knowledge would be nice. I am still a bid confused with Qlik, the general databases rules does not always make sense in Qlik. Sometimes it does, sometimes it does not. Sometimes I could just put it all in one big pot, and other times it is crucial to do at least 3rd normalization etc., split it up in a billion tables.

I kind of miss "specfic Qlik - general database theory" if that makes sense. Is there anything like that?

14,294 Views

I think it is fairly straightforward: Synthetic keys are not dangerous and should not be avoided, unless you have several of them and they "interfere" with each other, e.g. if one $Syn key is based on another. In such a case you should re-think your data model.

But if you have one single synthetic key that is "symmetric"; that uses the same constituent fields in all places where it is used, and you feel that the data model is OK, the synthetic key is most likely harmless.

But - you must of course always ask yourself whether it makes sense to use several keys between two tables; whether the data model is OK. Often one of the keys is redundant, and should therefore not be used.

HIC

14,294 Views
Not applicable

Henric Cronström

Hello Henric, thank yo for sharing such useful topic. I work with a similar table used in your script, however I have two fields only

@1= ProjectAttributes

@2= ProjectValue

project-info.png


My script is:

Projects:

GENERIC LOAD

Autonumber(@1) as %Projects_Key,

@1   as ProjectAttribute,

@2   as ProjectValue

FROM project-info.xlsx;

Here si the data model

generic-prefix-map.png

I would like to consolidate the tables into one using a For-Next loop becasue I have lots of Project Attributes.

I have no GENERICDB table where Key field is stored. I create my own Key (see %Projects_Key), so that your script below does not work for me.

CombinedGenericTable:

Load distinct Key From GenericDB;

Could you please advise on how I can consolidate the tables?

Thank yo for yrou support and please ask if the abpve does not make sense

Regards, Andrea

0 Likes
14,294 Views

You can use the for-next loop I have in the blog post. The only changes you need to make are

  • What I call 'GenericLabel', you call 'Projects'. (the prefix in the table names)
  • The Load statement that loads from 'GenericDB' should be changed to a load statement that loads from project-info.xlsx.

But what I don't understand is why you use autonumber to create the keys. That will result in that the same attribute (e.g. Project Name) all have the same Key value. It seems to me as if you should have an ID for the project as Key...

HIC

0 Likes
14,294 Views
Not applicable

Hic, I will try your options.

The project ID is in the excel files: cell B2. See image above.

I do not have a column Key (like your example have). hence why I was creating my own Key (%Projects_Key) using column A, the projects attributes.

What are your thoughts?Thank yof oryour support.

Regards, Andrea

0 Likes
14,294 Views

If the data is like in the picture and the same structure is repeated for each project - i.e. the first line of a project section is a line where 'Project Reference' is found in the first column - then you can create your project ID the following way:

tmpTable:

Load

   If(@1='Project Reference',@2,Peek(ProjectID)) as ProjectID,

   @1 as ProjectAttribute,

   @2 as ProjectValue

From ...

Then, in a second step, you can do the Generic Load:

Projects:

Generic Load ProjectID, ProjectAttribute, ProjectValue Resident tmpTable;

Drop Table tmpTable;

HIC

0 Likes
14,294 Views
Not applicable

Henric Cronström

Thank you for your reply. Your solution worked perfectly.

Keep up with the excellent support and knowledge sharing via the blog.

Regards, Andrea

0 Likes
14,294 Views
geetaalhan
Contributor

Thanks for sharing .

0 Likes
14,294 Views
Not applicable

Really very useful

great work sir

Thanks for sharing

0 Likes
14,294 Views
brindlogcool
Contributor III

It is very good post.

Generic Load works well when there is a key value. When we dont have the key then we cant use the generic load.

0 Likes
14,294 Views
MVP & Luminary
MVP & Luminary

You don't have to have a Key, although in most cases you would. If you don't have a key you can use a dummy or generated value for the first field. eg

Generic Load 0 as Key, Attribute, Value...

0 Likes
14,294 Views
brindlogcool
Contributor III

Thanks Rob. I found any way of converting the rows in to columns through Peek Function and Concat.Also no for loops and Joins

In My Case i wasn't having keys.

0 Likes
14,294 Views
ankitbisht01
Contributor

very nice blog thank you

0 Likes
14,294 Views
Arjunarao
Honored Contributor II

Nice document. Thank for sharing.

0 Likes
14,294 Views
MVP
MVP

I don't think you're right about "much larger than the initial tables". I mean yes, conceptually, of COURSE it's much larger if you think of it as one big table with most of the cells null. But I'm pretty sure that's not how QlikView actually stores data internally. (Edit: Mind you, you'd probably know far better than me how the data is stored internally.)

I have one of these loop and joins working on a table with 16,225,873 rows in the original table and 561 fields in the resulting table, which I would expect to be fairly sparse, with maybe 80-90% of cells null. I did a little testing of the size of the original, of the generic before joining, and of the joined result.

                         Original  Generic    Joined
QVD KB             97,521         NA    52,440
QVW KB            50,205    38,008   13,143
In Memory MB     116.0      137.5       74.1

So for at least my example, the joined version makes more efficient use of both QVW size and in-memory size than the generic version with all the individual tables. This isn't meant to suggest that joined is in general better, but I believe it at least demonstrates that the picture is not nearly so clear as one might at first think.

And I'll be honest - I expected the generic and joined sizes to be virtually the same based on my vague understanding of how QlikView stores data. I was surprised that the joined version was so much smaller. I have no explanation right now. I would have thought that generic would, just as you said, "store the data in an optimal way", and that the joined version would be negligibly larger. Seems we were both wrong for at least my example.

All that said, I agree that you should have a REASON to do the joining. It appears that space might be a rather surprising reason. An easier-to-view structure when you pull up the table diagram might be a reason. And in my case, the need to store the resulting data as a QVD was the primary reason. But I would in general definitely start with the unjoined generic and see if it suits my needs. No reason to increase my script complexity and load times unless I have a compelling reason to do so.

14,294 Views

Of course it depends on the data which result you get: The sparsity of the joined table (number of NULLs) and the cardinality of the fields.

The symbol tables will be the same in both cases. (See  Symbol Tables and Bit-Stuffed Pointers)

In the generic version, there will be one data table per attribute, and these will have two columns and just the right number of records. Keys without this specific attribute will not be stored.

In the joined version there will be only one data table, with one column per field and potentially a very large number of records. Every key value must have a record. Empty cells will have NULL, and since also NULL has an index, this will use memory. NULL uses the same number of bits as the other values in the column.

Hence, the drawback of the generic version is that the key field is repeated in every data table. And the drawback of the joined version is the large number of records of the data table. The question boils down to which of the two is more costly memorywise. If most attributes are used by most key values, then the joined version obviously is more compact. But if you have a very large number of attributes, and each key value only uses a few of them, the joined version is (or ought to be) is a potential memory problem. 

But I must admit that I am a little surprised that the joined version is that much smaller...

HIC

14,294 Views