Solved: Re: Preceding Load Performance - Qlik Community

Anonymous · ‎2018-03-20

Hello, I have read

and https://www.quickintelligence.co.uk/preceding-load-qlikview/ but I still have some doubts about preceding load. Is this:

Load ..., ReferenceDate,
  Age( ReferenceDate, BirthDate ) as Age;
Load *,
  Date( FromDate + IterNo() - 1 ) as ReferenceDate
Resident Policies
While IterNo() <= ToDate - FromDate + 1 ;

faster than this:

TmpTable:
Load *,
  Date( FromDate + IterNo() - 1 ) as ReferenceDate
Resident Policies
While IterNo() <= ToDate - FromDate + 1 ;

Table:
Load ..., ReferenceDate,
  Age( ReferenceDate, BirthDate ) as Age
Resident TmpTable;

And why? Is preceding load some kind of shortcut, or is it just a way to avoid needing lots of temp tables when you do calculations such as flags, etc.?

hic · ‎2018-03-20

Preceding Load is indeed a kind of shortcut: Records from the first (bottom) Load are piped into the preceding Load (the upper load), so that only one pass is made through data.

With a resident load from a temp table, multiple passes are made, and this usually takes more time.

HIC

Anonymous · ‎2018-03-20

Thank you, hic. I assume, then, that the time difference comes from the save, rename, load and drop steps, and not from a lower load time per record with the preceding load, as happens when you compare the per-record load time of a QVD file with that of a CSV (and maybe a resident load?).

hic · ‎2018-03-20

The time difference is due to the fact that a record needs to be read twice when using a resident load (once in the original load, and once in the resident load). It has nothing to do with the file format.

However, we have had a bug that affected the multi-threading of a preceding load, and made this slower, so you got the "opposite" result. I'm not sure about the status of this bug, but if it is fixed, the preceding load is faster.

HIC
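
To make the "read twice" point concrete, here is a toy model in Python. It is only a sketch and says nothing about Qlik's internals: the functions `expand`, `add_age`, `resident_chain` and `preceding_chain` are hypothetical names, dates are plain day numbers, and `add_age` is a simplified stand-in for `Age()`. It counts how many records each approach has to read to produce the same result.

```python
# Toy model only (not Qlik internals): count record reads per approach.

def expand(policy):
    """Model of the While / IterNo() loop: one output row per day."""
    days = policy["to_date"] - policy["from_date"] + 1
    for iter_no in range(1, days + 1):
        yield {**policy, "reference_date": policy["from_date"] + iter_no - 1}

def add_age(row):
    # Simplified stand-in for Age(): whole years between two day numbers.
    return {**row, "age": (row["reference_date"] - row["birth_date"]) // 365}

def resident_chain(policies):
    """Two passes: build a temp table, then read it again."""
    reads = 0
    tmp_table = []
    for policy in policies:          # pass 1: read source records
        reads += 1
        tmp_table.extend(expand(policy))
    final = []
    for row in tmp_table:            # pass 2: read every temp-table record
        reads += 1
        final.append(add_age(row))
    return final, reads

def preceding_chain(policies):
    """One pass: each expanded row is handed straight to the next step."""
    reads = 0
    final = []
    for policy in policies:          # the only pass over stored records
        reads += 1
        for row in expand(policy):
            final.append(add_age(row))
    return final, reads

policies = [
    {"id": 1, "from_date": 100, "to_date": 102, "birth_date": 0},
    {"id": 2, "from_date": 200, "to_date": 201, "birth_date": 10},
]
resident_rows, resident_reads = resident_chain(policies)
preceding_rows, preceding_reads = preceding_chain(policies)
```

Both functions return the same five rows, but the resident version reads 2 source records plus 5 temp-table records, while the piped version reads only the 2 source records.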

marcus_sommer · ‎2018-03-20

Hi Henric,

are you sure that a preceding-load chain should always be faster than a chain of resident loads?

Without a special need to optimize load times I prefer the preceding-load approach because it is easier and gives a better overview, but I have had cases in which a resident-load chain was approximately 2 to 3 times faster than the preceding-load chain. Of course it is a bit like comparing apples with pears, because I reduced a 5-load chain to a 2-load chain, but even if I had also reduced the preceding chain to a 2- or 3-load chain, it would not have been faster than the resident chain.

I should also mention that the loads contain a lot of fields and dozens of quite heavily nested calculations, and that the release used was QV 11.2 SR 12.

There is already a good test on this topic which shows quite a significant overhead for preceding loads. Is it really caused by a bug, or are there further factors?

The Cost of Preceding Load | Qlikview Cookbook

- Marcus

hic · ‎2018-03-21

Marcus

I am aware of Rob's test that you link to, and Rob's findings are correct. They are however caused by what I consider to be a bug. The bug is that a preceding Load causes the Load to become single-threaded, and thus slower. So if you compare a single-threaded preceding Load with a multi-threaded resident Load, then the latter is faster.

I am unsure about the status of the bug, but once fixed I dare say that a preceding Load (one step in a preceding-Load-chain) should always be faster than a resident Load.

HIC

dionverbeke · ‎2018-03-21

Thanks,

I also have a long outstanding question:

Which is better/faster :

RESIDENT load or

WRITING TO QVD AND OPTIMIZED load?

Dion.

hic · ‎2018-03-21

I would guess that a resident Load is faster, mainly because of the disk access time of the Store command. However, with today's solid state disks, disk access is fast, so I am not sure...

But what's the use case? If you need to load the table once more, then it is probably because you need to transform data in some way, and then the QVD will not be loaded optimized. Or ... ?

HIC

Anonymous · ‎2018-03-22

And why does the preceding load not need to read the records twice? I guess it is a way (the only one) to create a calculated column and read it while reading the same row, so that the calculation is not done twice. More or less what would happen if a function like "peek( Calculated_Column, 0 )" existed.

Peter_Cammaert · ‎2018-03-22

Mainly because in a preceding load, the results of the calculations done on each individual row during the first load are kept in a 1-record buffer, and this single-record buffer then serves as the source for the calculations done during the preceding load, before the end result is stored as a single row in the final table. And this is done record by record.

That is probably what Henric describes as "Records from the first (bottom) Load are piped into the preceding Load".

Or as a list of operations:

Open the source table
Read the first record
Perform 1st LOAD operations on this record
Perform 2nd LOAD operations on this record
Store row in target table
Read second record
...
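
The order of operations listed above can be imitated with chained Python generators. This is a rough analogy only, not how the Qlik engine is implemented: `bottom_load`, `preceding_load`, `run`, and the fields `value`, `doubled` and `flag` are all made up for illustration. The `trace` list records the order in which the steps happen.

```python
# Rough analogy only: lazy generators hand one record at a time
# from the bottom load to the preceding load.

def bottom_load(source, trace):
    for record in source:
        trace.append(f"read {record['id']}")
        trace.append(f"1st LOAD {record['id']}")
        # Hypothetical calculation done in the first (bottom) load
        yield {**record, "doubled": record["value"] * 2}

def preceding_load(rows, trace):
    for row in rows:
        trace.append(f"2nd LOAD {row['id']}")
        # Hypothetical calculation that reuses the field computed below
        yield {**row, "flag": row["doubled"] > 10}

def run(source):
    trace = []
    target = []
    for row in preceding_load(bottom_load(source, trace), trace):
        trace.append(f"store {row['id']}")
        target.append(row)
    return target, trace

source = [{"id": "A", "value": 3}, {"id": "B", "value": 8}]
target, trace = run(source)
for step in trace:
    print(step)
```

Because the generators are lazy, record A goes through read, 1st LOAD, 2nd LOAD and store before record B is read at all, which matches the record-by-record sequence described in the list.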