Skip to main content
Announcements
Qlik Connect 2024! Seize endless possibilities! LEARN MORE
cancel
Showing results for 
Search instead for 
Did you mean: 
Not applicable

Question to slow mapping load

Dear experts,

question: can anyone explain why

"mapping load resident where...  "   on big tables is much slower splitting the funcionality in two Statements:

"load * resident where... " and "mapping load resident <on extract>".

Here the question more detailed:

having some problems with a slow datamodel I experienced the following:

Table TMP_MESS: 32 Mio Rows (~1GB qvd-File)

The following statement takes about 7.5 minutes:

mapping load MEASUREMENT_ID, MEASUREMENT_VALUE resident TMP_MESS where DISTINCTIVENESS = 'GESAMT'

But making two Statements out of it takes not even 1 Minute:

TMPMESSTEST:

load MEASUREMENT_ID, MEASUREMENT_VALUE resident TMP_MESS where DISTINCTIVENESS = 'GESAMT'

mapping load MEASUREMENT_ID, MEASUREMENT_VALUE resident TMPMESSTEST

here are the according lines from the log:

30.04.2015 07:29:31:        17 fields found: %DATE, %MATCHCODE, %LOT, MEASUREMENT_TIME, MP_NAME, MEASUREMENT_ID, LOT, WAFER, STAGE, PARAMETER, DISTINCTIVENESS, Prober, Tester, Nadelkarte, Pinelek, MEASUREMENT_VALUE, IS_PASSED, 31,990,752 lines fetched

30.04.2015 07:30:40: 0037  TMPMESSTEST:

30.04.2015 07:30:40: 0038  load MEASUREMENT_ID, MEASUREMENT_VALUE resident TMP_MESS where DISTINCTIVENESS = 'GESAMT'

30.04.2015 07:30:40:        2 fields found: MEASUREMENT_ID, MEASUREMENT_VALUE, 539,562 lines fetched

30.04.2015 07:31:17: 0040  mapMessIDGES:

30.04.2015 07:31:17: 0041  mapping load MEASUREMENT_ID, MEASUREMENT_VALUE resident TMPMESSTEST

30.04.2015 07:31:17:        2 fields found: MEASUREMENT_ID, MEASUREMENT_VALUE, 539,562 lines fetched

30.04.2015 07:31:20: 0043  mapMessTest2:

30.04.2015 07:31:20: 0044  mapping load MEASUREMENT_ID, MEASUREMENT_VALUE resident TMP_MESS where DISTINCTIVENESS = 'GESAMT'

30.04.2015 07:31:20:        2 fields found: MEASUREMENT_ID, MEASUREMENT_VALUE, 539,562 lines fetched

30.04.2015 07:38:55: 0046  drop table  TMPMESSTEST

30.04.2015 07:38:56: 0048  QUALIFY *

Has anyone experienced the same?

Are there any rules or advices when to use one statement and when to spilt into two Statements?

Kind regards

Philip

7 Replies
maksim_senin
Partner - Creator III
Partner - Creator III

Hi,

I didn't see a rule and a advice on this and it's very curious to get to know this.

It's very similar (in sense of a speed) to a case of optimized/unoptimized load from qvd but here you deal with in-memory table.

Best regards,

Maxim

danieloberbilli
Specialist II
Specialist II

Seems as if Qlik suffers when doing an input-comparison (where...) and an output comparison (mapping...) in the same Load process. Would be indeed interesting if this is a general rule/learning.

marcus_sommer

I think the point from danieloberbillig goes in the right direction and in the two-step scenario the first filter-load runs in a kind of optimized load and the second mapping-load had only a few records to fetch. And within the first scenario both actions (filter + mapping) will be probably performed on record-level.

But it would be very interessting to have a deeper insight how it technically worked, maybe hic could give some details.

- Marcus

maksim_senin
Partner - Creator III
Partner - Creator III

Hi,

Sounds good and if mapping works before "where" filtering that would be an answer.

Did you compare time of single mapping+load without "where" and separate load without "where" + mapping?

If the assumption is correct, their time would be the same or almost the same.

Best regards,

Maxim

marcus_sommer

I haven't it directly compared by a mapping-table. But I have had recently a similar case with a group by and where-clause in a loop and the differences in load-times was enormous. I had the best results by separating the statements.

- Marcus

Peter_Cammaert
Partner - Champion III
Partner - Champion III

An explanation could be (just a guess) that the MAPPING LOAD RESIDENT takes an entire table (first two columns of course but that doesn't make a difference since for example an optimized load of an entire QVD stays optimized if you load only specific columns) and converts it into sort of symbol table (that's what is is eventually and what makes an applymap() awfully quick) almost immediately, while a MAPPING LOAD RESIDENT WHERE does the two main operations at the same time instead of like sort of a PRECEDING LOAD. Inserting entries row by row into a symbol table is apparently much slower than converting an entire table straight into a symbol table.

Very curious if Qlik technicians can shed some light on this.

Peter

Not applicable
Author

Dear all,

whether Qlik technicians will give a comment on this or not - I learned much from your comments.

Thanks a lot!

As a summary for me:

Both approaches (lingle statment or two statements) are valid. But when having problems with performance this is a good point to have a keen look on.

Kind regards

Philip