Avoid synthetic keys

geogou1973 · ‎2016-02-21

Hello.

I have the following script. If i reload the script as below

LOAD ITEM_ID,

BARCODE_ID,

BARCODE,

STORE_ID,

SALES_DATE,

SALES_WEEK,

SALES_WEEKDAY,

SALES_YEAR,

SALES_MONTH,

SALES_QTY,

SALES_AMT,

CURRENT_SELPRICE,

SALES_PERC_FPA,

SALES_AMT_FPA,

PRICE_LIST,

PRICE_NET_LIST,

PRICE_HALF_NET_LIST,

PRICE_NET_NET_LIST,

COST_SALES

FROM C:\QVDS\SALES_2016_01.QVD (qvd);

LOAD ITEM,

AMOUNT_FPA_TRIG,

AR_EISAGWGIS,

MODULE,

SALES_YEAR,

SALES_MONTH,

SALES_DATE,

AMOUNT_TRIG,

QTY_TRIG,

TRANS_TYPE,

WHS_TRIG,

CUSTOMER,

AFM_CUST,

STORE_ID,

ITEM_ID

FROM C:\QVDS\FRANCHISE_SALES_2016_01.QVD (qvd);

i have the following image in the table view.

If i concatenate the two tables i avoid the synthetic keys but i have i huge delay to load the second table.

The two tables have more than 25.000.000 records for one month.

Is there any other way to avoid the synthetic keys and not have the delay of loading the second table ?

Thank you in advance.

sunny_talwar · ‎2016-02-21

You can try this if you want to remove the synthetic key:

LOAD AutoNumber(ITEM_ID&'|'&STORE_ID&'|'&SALES_DATE) as KEY,

ITEM_ID,

BARCODE_ID,

BARCODE,

STORE_ID,

SALES_DATE,

SALES_WEEK,

SALES_WEEKDAY,

SALES_YEAR,

SALES_MONTH,

SALES_QTY,

SALES_AMT,

CURRENT_SELPRICE,

SALES_PERC_FPA,

SALES_AMT_FPA,

PRICE_LIST,

PRICE_NET_LIST,

PRICE_HALF_NET_LIST,

PRICE_NET_NET_LIST,

COST_SALES

FROM C:\QVDS\SALES_2016_01.QVD (qvd);

LOAD AutoNumber(ITEM_ID&'|'&STORE_ID&'|'&SALES_DATE) as KEY,

ITEM,

AMOUNT_FPA_TRIG,

AR_EISAGWGIS,

MODULE

AMOUNT_TRIG,

QTY_TRIG,

TRANS_TYPE,

WHS_TRIG,

CUSTOMER,

AFM_CUST

FROM C:\QVDS\FRANCHISE_SALES_2016_01.QVD (qvd);

I removed the ITEM_ID, STORE_ID, SALES_DATE, SALES_MONTH, and SALES_YEAR from the second table as they can not be referred from your first table and created a KEY field which would join the two tables without creating any synthetic keys.

HTH

Best,

Sunny

Anonymous · ‎2016-02-21

When you mention the huge delay to load the second table I reckon that delay is because the second table has fields in it that are not in the first table so the qvd concatenate load is not optimized as it is creating new fields.

To make it optimized you can first create a stub table of zero rows that has all the fields of both tables with something like this :

LOAD

null() as ITEM_ID ,

null() as BARCODE_ID ,

etc... for the rest of your fields for all tables

autogenerate(0);

The link explains this all quite well, Turning Unoptimized Loads into Optimized Loads | Qlik Community

Thus you should be able to do fast optimized loads and the concatenation may well result in a faster data model

geogou1973 · ‎2016-02-21

Hello Sunny.

I removed the command AutoNumber because i had more delay with this. I will put all the months from Jan 2015 to Feb 2016 in order to see the differnce.

Thank you for your help

maxgro · ‎2016-02-21

Bill, this is interesting but I didn't get an optimized load, or I'm missing something?

w: load rowno() as f1, RowNo() as f3 AutoGenerate 100000;

STORE w into w.qvd (qvd);

z: load rowno() as f1, RowNo() as f2 AutoGenerate 100000;

STORE z into z.qvd (qvd);

DROP Table w, z;

a: load null() as f1, null() as f2, null() as f3 AutoGenerate 1;

Concatenate (a) load * From z.qvd (qvd);

Concatenate (a) load * From w.qvd (qvd);

excel << INLB039 3 lines fetched

final << excel 3 lines fetched

w << AUTOGENERATE(100000) 100.000 lines fetched

z << AUTOGENERATE(100000) 100.000 lines fetched

a << AUTOGENERATE(1) 1 lines fetched

a << z 100.001 lines fetched

a << w 200.001 lines fetched

maxgro · ‎2016-02-21

I think you can have (if you concatenate) one of the 2 tables in optimized load; try to choose an optimized load for the bigger one.

geogou1973 · ‎2016-02-21

Hello Bill.

This is an interesting view. I try to simulate in my data because it is faster and i will see the results.

Thank you for your help

Anonymous · ‎2016-02-21

Yup, something does seem adrift there.

I just read this Q-Tip #9 – CONCATENATE or JOIN ? | Natural Synergies,, knocked your script into this :

w:

load

rand() as f1,

rand() as f2 ,

'W' as Source

AutoGenerate 1000000;

STORE w into w.qvd (qvd);

NoConcatenate

z:

load

rand() as f1,

rand() as f3 ,

'Z' as Source

AutoGenerate 1000000;

STORE z into z.qvd (qvd);

DROP Table w, z;

a: load * From w.qvd (qvd);

Join (a) load * From z.qvd (qvd);

...and was pleasantly surprised to see this :

w << AUTOGENERATE(1000000) 1,000,000 lines fetched

z << AUTOGENERATE(1000000) 1,000,000 lines fetched

a << w (qvd optimized) 1,000,000 lines fetched

z (qvd optimized) 1,000,000 lines fetched

MarcoWedel · ‎2016-02-21

maybe helpful:

Should We Stop Worrying and Love the Synthetic Key?

regards

Marco

maxgro · ‎2016-02-21

Yes, both load are optimized but I don't think join and concatenate are the same and you can switch them; it depends....

In the example below (concat and join) when I select a in f1,

- concat: f3, all values grayed

- join: f3 has aa possible (white)

xx:

load * inline [

key, f1

1, a

2, b

];

Concatenate (xx) load * inline [

key, f3

1, aa

2, bb

];