Qlik Community

Qlik Design Blog

All about product and Qlik solutions: scripting, data modeling, visual design, extensions, best practices, etc.

Employee
Employee

Autonumber function

Today I decided to blog about the Autonumber function that can be used to create a “compact memory representation of a complex key.”  Having recently learned about this function, I realized there have been times in the past when this would have been helpful to use.  For instance, when I need to build a link table in my data model, I often create keys that I use to link the tables.  Sometime these key fields are lengthy and are a combination of 3 or 4 fields.  In this blog, I will show you how you can use the Autonumber function to create a “compact memory representation of a complex key.”

Assume I load a data set that looks like this:

Data to load.png

And I want to load another data set that looks like this:

Data to load2.png

These two data sets have the same first four fields so if I were to load them as is, I would get a synthetic table in my data model.  To avoid that I will set up a key field in each of the tables that includes the FoodCategory, StoreNo, Year and Month fields.  This key field will be the field that links these two tables.  I will do this using a preceding load when I load both of these tables.  The script would look like this:

script.png

In the first table, I am using a preceding load to load all fields and then I am using the Autonumber function to create a key field that represents the four fields: FoodCategory, StoreNo, Year and Month.  I am doing the same thing in the second table I am loading but the difference here is that I am not loading the key fields.  By not loading the key fields, I am preventing a synthetic table from being loaded.  The end result looks like this:

table.png

Notice the FSYMkey field.  In this example, it is a unique integer that represents a larger expression.  In the past, I would have created the key field like this (see the FSYMkey2 field in the table below):

table2.png

FSYMkey2 is a more complex field that would have taken up more memory.  This example is small but if you had thousands of unique key fields like this, the consumed memory would add up.  By using the Autonumber function, I was able to use an integer to represent a long string thus minimizing the memory usage in my app.  This is one of many tricks that can be used to reduce the memory usage in your app.  Henric Cronstrom has some other ideas in his Symbol Tables and Bit-Stuffed Pointers blog.  Check it out.

Thanks,

Jennell

18 Comments

Agreed.  Using Autonumber() like this is, memory wise, phenomenally efficient as the values do not even get stored in the Symbol Tables - i.e. they take up zero bytes in the Symbol Tables.

0 Likes
1,271 Views
pljsoftware
Contributor III

Please, remember to do not use AutoNumber function when you want to store data in QVD file. AutoNumber function generate always number started from 1 for each load. So, different load time can generate different result number from the AutoNumber.

To reduce data for different fieds generated by AutoNumber function use AutoNumber(field_1 & ... & field_N, 'AutoNumberKey').

1,271 Views
MVP & Luminary
MVP & Luminary

‌Remember to always use the second parm to autonumber -- the "autoid" to ensure your values are sequential when autonumbering more than one field. Otherwise you will lose the symbol table optimization.

Rob

1,271 Views
gardenierbi
Contributor

And use a seperator between the fields!

Otherwise it could lead to duplicate id's :

Id1, Id2          -> AUTONUMBER(Id1 & Id2)

101, 12          -> AUTONUMBER(10112)

1011, 2          -> AUTONUMBER(10112)

Better :

Id1, Id2          -> AUTONUMBER(Id1 & '|' & Id2)

101, 12          -> AUTONUMBER(101|12)

1011, 2          -> AUTONUMBER(1011|2)

1,271 Views

Dear Sander,

We can write this way also

AUTONUMBER(Id1 &'-'& Id2) as Key2

Also Sander when i create two keys with different keys Ex:-

AUTONUMBER(Id1 & Id2) as Key1,

AUTONUMBER(Id1 &'-'& Id2) as Key2;

Then why second key start from 2and then 3.

Ex:-

LOAD *,

AUTONUMBER(Id1 & Id2) as Key1,

AUTONUMBER(Id1 &'-'& Id2) as Key2;

LOAD * Inline

[

Id1, Id2

101, 12

1011, 2

];

OutPut:-

   

Id1Id2Key1Key2
1011212
1011213

Regards

Anand

0 Likes
1,271 Views
joydipp1988
Contributor

Hi all,

I would like to share one practical approach of what @pljsoftware and @rwunderlich were trying to say.

In my sales app I had 3 keys. %Item_Name, %Party_Name and %Ledger_Name.

I applied 2 approaches to do this.

Case 1: No autoid is applied

  • Script

  /* ---------------------------------------- Ledger Master ---------------------------------------- */
Left Keep (FACT_TABLE)

LEDGER_MASTER:
LOAD Distinct
AutoNumber(%LEDGER_NAME) as %LEDGER_NAME,
@LEDGER_NAME,
@LEDGER_LOCATION
FROM $(vMasterQVDs)LEDGER_MASTER.qvd(qvd);

/* ---------------------------------------- Party Master ---------------------------------------- */
Left Keep (FACT_TABLE)

PARTY_MASTER:
LOAD Distinct
AutoNumber(%PARTY_NAME) as %PARTY_NAME,
@PARTY_NAME
FROM $(vMasterQVDs)PARTY_MASTER.qvd(qvd);

/* ---------------------------------------- Item Master ---------------------------------------- */
Left Keep (FACT_TABLE)

ITEM_MASTER:
LOAD
AutoNumber(%ITEM_NAME) as %ITEM_NAME,
@ITEM_NAME,
@ITEM_CATEGORY
FROM $(vMasterQVDs) ITEM_MASTER.qvd(qvd);


  • Output

No AutoId applied.PNGSo here sequential no gets generated but distributed over all the Autonumbered Key.

Case 2: Autoid is applied


  • Script

/* ---------------------------------------- Ledger Master ---------------------------------------- */
Left Keep (FACT_TABLE)

LEDGER_MASTER:
LOAD Distinct
AutoNumber(%LEDGER_NAME,'%LEDGER_NAME') as %LEDGER_NAME,
@LEDGER_NAME,
@LEDGER_LOCATION
FROM $(vMasterQVDs)LEDGER_MASTER.qvd(qvd);

/* ---------------------------------------- Party Master ---------------------------------------- */
Left Keep (FACT_TABLE)

PARTY_MASTER:
LOAD Distinct
AutoNumber(%PARTY_NAME,'PARTY_NAME') as %PARTY_NAME,
@PARTY_NAME
FROM $(vMasterQVDs)PARTY_MASTER.qvd(qvd);

/* ---------------------------------------- Item Master ---------------------------------------- */
Left Keep (FACT_TABLE)

ITEM_MASTER:
LOAD
AutoNumber(%ITEM_NAME,'%ITEM_NAME') as %ITEM_NAME,
@ITEM_NAME,
@ITEM_CATEGORY

FROM $(vMasterQVDs) ITEM_MASTER.qvd(qvd);

  • Output

AutoId applied.PNG

Here each Key starts with 1 and generating sequential integers for individual fields.

Memory Consumption

Memory consumption.PNGSee the memory consumption of the above two approaches. Although the app size is very small but still there has an impact of using Autonumber() with the autoid.

0 Likes
1,271 Views
gardenierbi
Contributor

@its.anandrjs‌‌ it doesn't care which seperator you use as long you select one or multiple to create unique combinations.

0 Likes
1,271 Views
jolivares
Valued Contributor

Continuing with the example of Sanders, I think you can solve this using the function AutoNumberHashXXX.

Load *, AutoNumber(F1&F2,'AN1') as AN,

     AutoNumber(F1&'-'&F2,'AN2') as AN2,

     AutoNumberHash128(F1,F2,'AN3') as AN3;

LOAD * INLINE [

    F1, F2

    101, 12

    1011, 2

];

Capture.PNG

As you can see the first case as mention before, it gives you the same result, but using a separator or the function AutoNumberHash128, it solve the problem.

1,271 Views
krt
New Contributor III

Yes, but as mentioned prior, don't use autonumberhashxxx when building composite keys in your QVDs, which often get built and loaded at different times.  So one approach is to use the hashxxx(field1,field2...fieldn) when building composite keys in QVDs (fields can be listed without a separator) and then use autonumber() on the keys when loading from QVDs in order to conserve memory.  Since autonumber is only referencing one field (the prebuilt hash key), you don't need the autoID, but it's good practice to get in the habit in case you are autonumbering more than one field. 

Per the Qlik help site:

You can only connect autonumberhash256 keys that have been generated in the same data load, as the integer is generated according to the order the table is read. If you need to use keys that are persistent between data loads, independent of source data sorting, you should use thehash128, hash160 or hash256 functions.



0 Likes
1,271 Views
MVP & Luminary
MVP & Luminary

kyleroth

"Since autonumber is only referencing one field (the prebuilt hash key), you don't need the autoID".


I don't think that statement is correct. The AutoID has nothing to do with the number of fields you are using as input parameters. Rather AutoID is required when you are creating more than one output fieldname -- to ensure sequential numbers.

-Rob

0 Likes
1,271 Views
vikasmahajan
Esteemed Contributor

Nice

0 Likes
1,271 Views
krt
New Contributor III

Rob - you are correct.  I misunderstood it's function.  Thanks for correcting.

Kyle

0 Likes
1,271 Views
benrig44
New Contributor III

Rob,

"Remember to always use the second parm to autonumber -- the "autoid" to ensure your values are sequential when autonumbering more than one field. Otherwise you will lose the symbol table optimization."

If by Symbol Table Optimization you mean that the field value is not stored and only the pointer is, that's not true. I can only speak of V11SR12 QVW (classic). Using the AutoID parameter or not still maintains the optimization.

0 Likes
1,271 Views
MVP & Luminary
MVP & Luminary

Steven,

The AutoID parameter is only important when autonumbering more than one field name. You can leave it off if only doing one field -- is that your case? 

-Rob

0 Likes
1,271 Views
benrig44
New Contributor III

Rob, I think we are getting our wires crossed. I am OK with the concept of the need for AutoID or not. I am responding to what I think is your assertion that the special optimization that AutoNumber has only happens when the AutoID parameter is used.

The special optimization being that if the output of AutoNumber is stored in a field then that field's symbol table only stores the pointer and not the value as well. That avoidance of storing the values as well saves memory in the symbols table area of memory.

Maybe I was connecting the logic dots incorrectly in your statement below:

"Remember to always use the second parm to autonumber -- the "autoid" to ensure your values are sequential when autonumbering more than one field. Otherwise you will lose the symbol table optimization."

Steven

0 Likes
1,271 Views
benrig44
New Contributor III

Joydip,

I recommend you do not use the QVW file size to give you an accurate insight memory usage. There is a standard feature of QVW that can do that far more accurately.

Use
QVW Desktop \ Document Properties \ General \ Memory Statistics

You can also use this feature combined with Rob Wunderlich's excellent Document Analyzer tool available at:

http://qlikviewcookbook.com/tools/#squelch-taas-accordion-shortcode-content-2

0 Likes
1,271 Views
MVP & Luminary
MVP & Luminary

Steven,

"Otherwise you will lose the symbol table optimization".


The "symbol table optimization" (my made-up name) has nothing to do with AutoNumber per se. It's that a field containing only sequential integers does not require a symbol table, thereby saving some memory.

Recommending AutoID for AutoNumber is just a way to ensure that the output of AutoNumber will be sequential with no gaps. Gaps would likely occur if you autonumbered more than one field and omitted AutoID.

-Rob

1,271 Views
benrig44
New Contributor III

Ah got you now Rob. Yes AutoNumber, IterNo, RowNo, and RecNo falling into the same category regard to integer counters saving memory by not storing values. But, as you say, if something interferes with that +1 numbering strategy the memory saving technique does not apply.

0 Likes
1,271 Views