7 Replies Latest reply: May 5, 2015 8:19 AM by Philip Munz RSS

    Question to slow mapping load

    Philip Munz

      Dear experts,

       

      question: can anyone explain why

      "mapping load resident where...  "   on big tables is much slower splitting the funcionality in two Statements:

      "load * resident where... " and "mapping load resident <on extract>".

       

      Here the question more detailed:

       

      having some problems with a slow datamodel I experienced the following:

      Table TMP_MESS: 32 Mio Rows (~1GB qvd-File)

       

      The following statement takes about 7.5 minutes:

      mapping load MEASUREMENT_ID, MEASUREMENT_VALUE resident TMP_MESS where DISTINCTIVENESS = 'GESAMT'

       

      But making two Statements out of it takes not even 1 Minute:

      TMPMESSTEST:

      load MEASUREMENT_ID, MEASUREMENT_VALUE resident TMP_MESS where DISTINCTIVENESS = 'GESAMT'

      mapping load MEASUREMENT_ID, MEASUREMENT_VALUE resident TMPMESSTEST

       

      here are the according lines from the log:

       

      30.04.2015 07:29:31:        17 fields found: %DATE, %MATCHCODE, %LOT, MEASUREMENT_TIME, MP_NAME, MEASUREMENT_ID, LOT, WAFER, STAGE, PARAMETER, DISTINCTIVENESS, Prober, Tester, Nadelkarte, Pinelek, MEASUREMENT_VALUE, IS_PASSED, 31,990,752 lines fetched

      30.04.2015 07:30:40: 0037  TMPMESSTEST:

      30.04.2015 07:30:40: 0038  load MEASUREMENT_ID, MEASUREMENT_VALUE resident TMP_MESS where DISTINCTIVENESS = 'GESAMT'

      30.04.2015 07:30:40:        2 fields found: MEASUREMENT_ID, MEASUREMENT_VALUE, 539,562 lines fetched

      30.04.2015 07:31:17: 0040  mapMessIDGES:

      30.04.2015 07:31:17: 0041  mapping load MEASUREMENT_ID, MEASUREMENT_VALUE resident TMPMESSTEST

      30.04.2015 07:31:17:        2 fields found: MEASUREMENT_ID, MEASUREMENT_VALUE, 539,562 lines fetched

      30.04.2015 07:31:20: 0043  mapMessTest2:

      30.04.2015 07:31:20: 0044  mapping load MEASUREMENT_ID, MEASUREMENT_VALUE resident TMP_MESS where DISTINCTIVENESS = 'GESAMT'

      30.04.2015 07:31:20:        2 fields found: MEASUREMENT_ID, MEASUREMENT_VALUE, 539,562 lines fetched

      30.04.2015 07:38:55: 0046  drop table  TMPMESSTEST

      30.04.2015 07:38:56: 0048  QUALIFY *

       

      Has anyone experienced the same?

      Are there any rules or advices when to use one statement and when to spilt into two Statements?

       

      Kind regards

      Philip

        • Re: Question to slow mapping load
          Maxim Senin

          Hi,

           

          I didn't see a rule and a advice on this and it's very curious to get to know this.

           

          It's very similar (in sense of a speed) to a case of optimized/unoptimized load from qvd but here you deal with in-memory table.

           

          Best regards,

          Maxim

          • Re: Question to slow mapping load
            Daniel Oberbillig

            Seems as if Qlik suffers when doing an input-comparison (where...) and an output comparison (mapping...) in the same Load process. Would be indeed interesting if this is a general rule/learning.

            • Re: Question to slow mapping load
              Marcus Sommer

              I think the point from danieloberbillig goes in the right direction and in the two-step scenario the first filter-load runs in a kind of optimized load and the second mapping-load had only a few records to fetch. And within the first scenario both actions (filter + mapping) will be probably performed on record-level.

               

              But it would be very interessting to have a deeper insight how it technically worked, maybe hic could give some details.

               

              - Marcus

                • Re: Question to slow mapping load
                  Maxim Senin

                  Hi,

                   

                  Sounds good and if mapping works before "where" filtering that would be an answer.

                   

                  Did you compare time of single mapping+load without "where" and separate load without "where" + mapping?

                   

                  If the assumption is correct, their time would be the same or almost the same.

                   

                  Best regards,

                  Maxim

                    • Re: Question to slow mapping load
                      Marcus Sommer

                      I haven't it directly compared by a mapping-table. But I have had recently a similar case with a group by and where-clause in a loop and the differences in load-times was enormous. I had the best results by separating the statements.

                       

                      - Marcus

                        • Re: Question to slow mapping load
                          Peter Cammaert

                          An explanation could be (just a guess) that the MAPPING LOAD RESIDENT takes an entire table (first two columns of course but that doesn't make a difference since for example an optimized load of an entire QVD stays optimized if you load only specific columns) and converts it into sort of symbol table (that's what is is eventually and what makes an applymap() awfully quick) almost immediately, while a MAPPING LOAD RESIDENT WHERE does the two main operations at the same time instead of like sort of a PRECEDING LOAD. Inserting entries row by row into a symbol table is apparently much slower than converting an entire table straight into a symbol table.

                           

                          Very curious if Qlik technicians can shed some light on this.

                           

                          Peter

                    • Re: Question to slow mapping load
                      Philip Munz

                      Dear all,

                       

                      whether Qlik technicians will give a comment on this or not - I learned much from your comments.

                      Thanks a lot!

                      As a summary for me:

                      Both approaches (lingle statment or two statements) are valid. But when having problems with performance this is a good point to have a keen look on.

                       

                      Kind regards

                      Philip