Skip to main content
Announcements
Introducing Qlik Answers: A plug-and-play, Generative AI powered RAG solution. READ ALL ABOUT IT!
cancel
Showing results for 
Search instead for 
Did you mean: 
geogou1973
Creator
Creator

Avoid synthetic keys

Hello.

I have the following script. If i reload the script as below

LOAD ITEM_ID,

     BARCODE_ID,

     BARCODE,

     STORE_ID,

     SALES_DATE,

     SALES_WEEK,

     SALES_WEEKDAY,

     SALES_YEAR,

     SALES_MONTH,

     SALES_QTY,

     SALES_AMT,

     CURRENT_SELPRICE,

     SALES_PERC_FPA,

     SALES_AMT_FPA,

     PRICE_LIST,

     PRICE_NET_LIST,

     PRICE_HALF_NET_LIST,

     PRICE_NET_NET_LIST,

     COST_SALES

FROM C:\QVDS\SALES_2016_01.QVD (qvd);

LOAD ITEM,

     AMOUNT_FPA_TRIG,

     AR_EISAGWGIS,

     MODULE,

     SALES_YEAR,

     SALES_MONTH,

     SALES_DATE,

     AMOUNT_TRIG,

     QTY_TRIG,

     TRANS_TYPE,

     WHS_TRIG,

     CUSTOMER,

     AFM_CUST,

     STORE_ID,

     ITEM_ID

FROM C:\QVDS\FRANCHISE_SALES_2016_01.QVD (qvd);

i have the following image in the table view. Teble View.jpg

If i concatenate the two tables i avoid the synthetic keys but i have i huge delay to load the second table.

The two tables have more than 25.000.000 records for one month.

Is there any other way to avoid the synthetic keys and not have the delay of loading the second table ?

Thank you in advance.

11 Replies
sunny_talwar

You can try this if you want to remove the synthetic key:

LOAD AutoNumber(ITEM_ID&'|'&STORE_ID&'|'&SALES_DATE) as KEY,

    ITEM_ID,

    BARCODE_ID,

    BARCODE,

    STORE_ID,

    SALES_DATE,

    SALES_WEEK,

    SALES_WEEKDAY,

    SALES_YEAR,

    SALES_MONTH,

    SALES_QTY,

    SALES_AMT,

    CURRENT_SELPRICE,

    SALES_PERC_FPA,

    SALES_AMT_FPA,

    PRICE_LIST,

    PRICE_NET_LIST,

    PRICE_HALF_NET_LIST,

    PRICE_NET_NET_LIST,

    COST_SALES

FROM C:\QVDS\SALES_2016_01.QVD (qvd);

LOAD AutoNumber(ITEM_ID&'|'&STORE_ID&'|'&SALES_DATE) as KEY,

    ITEM,

    AMOUNT_FPA_TRIG,

    AR_EISAGWGIS,

    MODULE

    AMOUNT_TRIG,

    QTY_TRIG,

    TRANS_TYPE,

    WHS_TRIG,

    CUSTOMER,

    AFM_CUST

FROM C:\QVDS\FRANCHISE_SALES_2016_01.QVD (qvd);

I removed the ITEM_ID, STORE_ID, SALES_DATE, SALES_MONTH, and SALES_YEAR from the second table as they can not be referred from your first table and created a KEY field which would join the two tables without creating any synthetic keys.

HTH

Best,

Sunny

Anonymous
Not applicable

When you mention the huge delay to load the second table I reckon that delay is because the second table has fields in it that are not in the first table so the qvd concatenate load is not optimized as it is creating new fields.

To make it optimized you can first create a stub table of zero rows that has all the fields of both tables with something like this :

  

LOAD

null() as ITEM_ID ,

null() as BARCODE_ID ,

etc... for the rest of your fields for all tables

autogenerate(0);

The link explains this all quite well, Turning Unoptimized Loads into Optimized Loads | Qlik Community

Thus you should be able to do fast optimized loads and the concatenation may well result in a faster data model

geogou1973
Creator
Creator
Author

Hello Sunny.

I removed the command AutoNumber because i had more delay with this. I will put all the months from Jan 2015 to Feb 2016 in order to see the differnce.

Thank you for your help

maxgro
MVP
MVP

Bill, this is interesting but I didn't get an optimized load, or I'm missing something?

w: load rowno() as f1, RowNo() as f3 AutoGenerate 100000;

STORE w into w.qvd (qvd);

z: load rowno() as f1, RowNo() as f2 AutoGenerate 100000;

STORE z into z.qvd (qvd);

DROP Table w, z;

a: load null() as f1, null() as f2, null() as f3 AutoGenerate 1;

Concatenate (a) load * From z.qvd (qvd);

Concatenate (a) load * From w.qvd (qvd);

excel << INLB039 3 lines fetched

final << excel 3 lines fetched

w << AUTOGENERATE(100000) 100.000 lines fetched

z << AUTOGENERATE(100000) 100.000 lines fetched

a << AUTOGENERATE(1) 1 lines fetched

a << z 100.001 lines fetched

a << w 200.001 lines fetched

maxgro
MVP
MVP

I think you can have (if you concatenate) one of the 2 tables in optimized load; try to choose an optimized load for the bigger one.

geogou1973
Creator
Creator
Author

Hello Bill.

This is an interesting view. I try to simulate in my data because it is faster and i will see the results.

Thank you for your help

Anonymous
Not applicable

Yup, something does seem adrift there.

I just read this Q-Tip #9 – CONCATENATE or JOIN ? | Natural Synergies,, knocked your script into this :


w:

load

  rand() as f1,

  rand()  as f2 ,

  'W' as Source

AutoGenerate 1000000;

STORE w into w.qvd (qvd);

NoConcatenate

z:

load

  rand() as f1,

  rand()  as f3 ,

  'Z' as Source

AutoGenerate 1000000;

STORE z into z.qvd (qvd);

DROP Table w, z;

a: load * From w.qvd (qvd);

Join (a) load * From z.qvd (qvd);

...and was pleasantly surprised to see this :

w << AUTOGENERATE(1000000) 1,000,000 lines fetched

z << AUTOGENERATE(1000000) 1,000,000 lines fetched

a << w (qvd optimized) 1,000,000 lines fetched

z (qvd optimized) 1,000,000 lines fetched

MarcoWedel

maxgro
MVP
MVP

Yes, both load are optimized but I don't think join and concatenate are the same and you can switch them; it depends....

In the example below (concat and join) when I select a in f1,

- concat: f3, all values grayed  

- join:      f3 has aa possible (white)

xx:

load * inline [

key, f1

1, a

2, b

];

Concatenate (xx) load * inline [

key, f3

1, aa

2, bb

];

2.png

xx:

load * inline [

key, f1

1, a

2, b

];

join (xx) load * inline [

key, f3

1, aa

2, bb

];

1.png