Skip to main content
Announcements
Join us at Qlik Connect for 3 magical days of learning, networking,and inspiration! REGISTER TODAY and save!
cancel
Showing results for 
Search instead for 
Did you mean: 
datanibbler
Champion
Champion

LOAD times

Hi,

I have a question rgd. LOAD times for one specific table:

- The table is very big, but on one field it can be split: One part (only one value in that field) has about 5% of all records, the other part (all other values in that field) has about 95% of records.

- There is nów only one filter splitting the table in two: >> TRANCODE = 'SPRECEIVE' << and >> TRANCODE <> 'SPRECEIVE' <<

<=> The first part (LOADed from a qvd file) takes much longer to load than the second one.

These are the two loads: FIRST LOAD:

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>><<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<

  Trans_Dtl_Rec:
LOAD
ITEM_NUMBER as Rec.ITEM_NUMBER,
TRAN_DATE_TD as Rec.TRAN_DATE_TD,
TRAN_DATE_Time as Rec.TRAN_DATE_Time,
TRAN_CODE as Rec.TRAN_CODE,
QUANTITY as Rec.QUANTITY,
BIN as Rec.BIN,
TRACKING_NUMBER as Rec.TRACKING_NUMBER,
CONTROL_NUMBER as Rec.CONTROL_NUMBER,
SERIAL_LOT_NUMBER as Rec.SERIAL_LOT_NUMBER
FROM $(v_Pfad_TransDtl) (qvd)
WHERE TRAN_CODE = 'SPRECEIVE';

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>><<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<

SECOND LOAD

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>><<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<

  Trans_Dtl_aside:
LOAD
ITEM_NUMBER as Aside.ITEM_NUMBER,
TRAN_DATE_TD as Aside.TRAN_DATE_TD,
TRAN_DATE_Time as Aside.TRAN_DATE_Time,
TRAN_CODE as Aside.TRAN_CODE,
QUANTITY as Aside.QUANTITY,
BIN as Aside.BIN,
TRACKING_NUMBER as Aside.TRACKING_NUMBER,
CONTROL_NUMBER as Aside.CONTROL_NUMBER,
SERIAL_LOT_NUMBER as Aside.SERIAL_LOT_NUMBER
FROM $(v_Pfad_TransDtl) (qvd)
WHERE TRAN_CODE <> 'SPRECEIVE';

=> Is there any way I can speed that up?

Thanks a lot!

Best regards,

DataNibbler

1 Solution

Accepted Solutions
marcus_sommer

Hi DataNibbler,

at first I would check if it's really really needed to split this into two tables - of course you will have a reason but there might be other ways to solve this task.

If you want to speed up a qvd-load you must be to ensure that the load runs optimized. You could reach this by using a where exists() statement. Try something like this:

Dummy:

Load 'SPRECEIVE' as TRAN_CODE autogenerate 1;

Trans_Dtl_Rec:
LOAD
ITEM_NUMBER as Rec.ITEM_NUMBER,
TRAN_DATE_TD as Rec.TRAN_DATE_TD,
TRAN_DATE_Time as Rec.TRAN_DATE_Time,
TRAN_CODE as Rec.TRAN_CODE,
QUANTITY as Rec.QUANTITY,
BIN as Rec.BIN,
TRACKING_NUMBER as Rec.TRACKING_NUMBER,
CONTROL_NUMBER as Rec.CONTROL_NUMBER,
SERIAL_LOT_NUMBER as Rec.SERIAL_LOT_NUMBER
FROM $(v_Pfad_TransDtl) (qvd)
WHERE exists(TRAN_CODE)';


  Trans_Dtl_aside:

LOAD
ITEM_NUMBER as Aside.ITEM_NUMBER,
TRAN_DATE_TD as Aside.TRAN_DATE_TD,
TRAN_DATE_Time as Aside.TRAN_DATE_Time,
TRAN_CODE as Aside.TRAN_CODE,
QUANTITY as Aside.QUANTITY,
BIN as Aside.BIN,
TRACKING_NUMBER as Aside.TRACKING_NUMBER,
CONTROL_NUMBER as Aside.CONTROL_NUMBER,
SERIAL_LOT_NUMBER as Aside.SERIAL_LOT_NUMBER
FROM $(v_Pfad_TransDtl) (qvd)

WHERE not exists(TRAN_CODE)';


drop table Dummy;

- Marcus

View solution in original post

30 Replies
marcus_sommer

Hi DataNibbler,

at first I would check if it's really really needed to split this into two tables - of course you will have a reason but there might be other ways to solve this task.

If you want to speed up a qvd-load you must be to ensure that the load runs optimized. You could reach this by using a where exists() statement. Try something like this:

Dummy:

Load 'SPRECEIVE' as TRAN_CODE autogenerate 1;

Trans_Dtl_Rec:
LOAD
ITEM_NUMBER as Rec.ITEM_NUMBER,
TRAN_DATE_TD as Rec.TRAN_DATE_TD,
TRAN_DATE_Time as Rec.TRAN_DATE_Time,
TRAN_CODE as Rec.TRAN_CODE,
QUANTITY as Rec.QUANTITY,
BIN as Rec.BIN,
TRACKING_NUMBER as Rec.TRACKING_NUMBER,
CONTROL_NUMBER as Rec.CONTROL_NUMBER,
SERIAL_LOT_NUMBER as Rec.SERIAL_LOT_NUMBER
FROM $(v_Pfad_TransDtl) (qvd)
WHERE exists(TRAN_CODE)';


  Trans_Dtl_aside:

LOAD
ITEM_NUMBER as Aside.ITEM_NUMBER,
TRAN_DATE_TD as Aside.TRAN_DATE_TD,
TRAN_DATE_Time as Aside.TRAN_DATE_Time,
TRAN_CODE as Aside.TRAN_CODE,
QUANTITY as Aside.QUANTITY,
BIN as Aside.BIN,
TRACKING_NUMBER as Aside.TRACKING_NUMBER,
CONTROL_NUMBER as Aside.CONTROL_NUMBER,
SERIAL_LOT_NUMBER as Aside.SERIAL_LOT_NUMBER
FROM $(v_Pfad_TransDtl) (qvd)

WHERE not exists(TRAN_CODE)';


drop table Dummy;

- Marcus

ganeshsvm
Creator II
Creator II

Try to load using Where Exist().

Like this,

TEMP;

Load * Inline [

TRAN_CODE

SPRECEIVE

];

TRANS_DTL_REC:

LOAD

......

FROM .....(QVD)

WHERE Exists(TRAN_CODE);

TRANS_DTL_REC:

LOAD

......

FROM .....(QVD)

WHERE Not Exists(TRAN_CODE);

Drop table TEMP;

This will save some time.

datanibbler
Champion
Champion
Author

Hi,

thanks a lot! I will try.

Yes, the splitting of the table does have a reason - I need the respective parts several times during the script execution, so to make all these LOADs RESIDENT, I load the table beforehand, and since in some places I need only the smaller part, in others I need only the bigger part, it makes sense as both parts are smaller to parse than the total ...

This still does not explain why the smaller part takes much longer in the loading than the big one - but never mind. If I can speed the whole thing up, all the better.

Now I have introduced a partial_reload_scheme, in case the report is used several times one after another, so that these tables are just left standing instead of being reloaded every time.

What I'd like to do next is, building on top of this, to make the script decide itself when it makes sense to reload that big table - once an hour, when it has been updated from the database - and otherwise, it should always perform a partial reload - or I could do it otherwise - but I will make that a new thread in order to not confuse things here.

Best regards,

DataNibbler

datanibbler
Champion
Champion
Author

Hi,

the LOADs still do not run in optimized mode and I cannot spot why:

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>><<<<<<<<<<<<<<<<<<<<<<<<<<<<<<

Temp:
LOAD 'SPRECEIVE' as TranCode2look4 AutoGenerate 1;

Trans_Dtl_Rec:
LOAD
ITEM_NUMBER as Rec.ITEM_NUMBER,
TRAN_DATE_TD as Rec.TRAN_DATE_TD,
TRAN_DATE_Time as Rec.TRAN_DATE_Time,
TRAN_CODE as Rec.TRAN_CODE,
QUANTITY as Rec.QUANTITY,
BIN as Rec.BIN,
TRACKING_NUMBER as Rec.TRACKING_NUMBER,
CONTROL_NUMBER as Rec.CONTROL_NUMBER,
SERIAL_LOT_NUMBER as Rec.SERIAL_LOT_NUMBER
FROM $(v_Pfad_TransDtl) (qvd)
WHERE EXISTS(TranCode2look4, TRAN_CODE);

marcus_sommer

Hi DataNibbler,

only the single-parameter version of the exists(TRAN_CODE) statement will be optimized. As far as you put a second parameter into it you forced qlik to process this load instead of transferring it directly into the RAM.

Although the approach of using where exists() worked very well you could be forced to adjust your load-approaches a bit to be able to use the single-parameter version in cases where this field is already loaded within previous loadings.

One approach is to create a temp-field for it within the load which creates the qvd and another would be to rename the fields like:

rename field TRAN_CODE into TRAN_CODE_TEMP;

Temp:
LOAD 'SPRECEIVE' as TRAN_CODE AutoGenerate 1;

Trans_Dtl_Rec:
LOAD
ITEM_NUMBER as Rec.ITEM_NUMBER,
TRAN_DATE_TD as Rec.TRAN_DATE_TD,
TRAN_DATE_Time as Rec.TRAN_DATE_Time,
TRAN_CODE as Rec.TRAN_CODE,
QUANTITY as Rec.QUANTITY,
BIN as Rec.BIN,
TRACKING_NUMBER as Rec.TRACKING_NUMBER,
CONTROL_NUMBER as Rec.CONTROL_NUMBER,
SERIAL_LOT_NUMBER as Rec.SERIAL_LOT_NUMBER
FROM $(v_Pfad_TransDtl) (qvd)
WHERE EXISTS(TRAN_CODE);

rename field TRAN_CODE_TEMP into TRAN_CODE;

- Marcus

datanibbler
Champion
Champion
Author

Hi Marcus,

I cannot quite follow the execution_order of your code - where does this RENAME statement come in?

marcus_sommer

Hi DataNibbler,

if your field is already loaded and maybe therefore not be usable for the where-exists-logic you would rename it directly before you want to load your optimized qvd's and afterwards you might want/need to rename it again - it will be depending on your datamodel.

Of course this kind of scripting adds some more overhead and complexity but it's the price you need to pay to get optimized loadings.

Here are various quite explanatory links included to the exists-topic which will help you to understand how exists worked and how it could be implemented: Advanced topics for creating a qlik datamodel.

- Marcus

datanibbler
Champion
Champion
Author

Thanks Marcus!

I will have a look.

datanibbler
Champion
Champion
Author

Hi Marcus,

I think I have read all of the articles on the issue. I know what the issue is - but I still cannot imagine how to avoid it. The confusing thing about your post (your example) is that those RENAME commands come as the very first and last statements, around everything else, not within the LOAD, so I don't know what they are supposed to do - although I think I remember I have come across this issue before ...

Sorry. Can you give me one more hint?

Thanks a lot!

Best regards,

DataNibbler