4 Replies Latest reply: Feb 5, 2017 4:39 PM by Stefan Wühl RSS

    Preceding load with group by

    Bill Markham

      Using this to create a test Data table :

       

      Data :

      LOAD

        Date(Date#(Date,'DD-MMM-YYYY')) as Date ,

        Site

      ;

      Load *

      INLINE [

          Date, Site

          01-May-2015, Site 1

          02-May-2015, Site 2

          03-May-2015, Site 2

          04-May-2015, Site 1

          05-May-2015, Site 1

          06-May-2015, Site 2

          07-May-2015, Site 1

          08-May-2015, Site 1

      ];

       

      ===============================

       

      I then run this :

       

      Temp01 :

      load

        peek ( Site ) ,

        num(if ( Site = peek ( Site ) , peek( [Sub] ) , alt ( peek( [Sub] ) + 1  , 1 ) ) ) as [Sub] ,

        *

      resident Data

      order by Date , Site

      ;

       

      Temp02 :

      NoConcatenate

      load

        date(Min(Date)) as [Start Date] ,

        date(Max(Date)) as [End Date] ,

        Sub ,

        only(Site) as Site

      resident Temp01

      group by Sub

      ;

       

      drop table Temp01 ;

      drop table Data ;

       

      I get this in Temp02 table, which is what I want :

       

      Sub Site Start Date End Date
      1Site 101 May 201501 May 2015
      2Site 202 May 201503 May 2015
      3Site 104 May 201505 May 2015
      4Site 206 May 201506 May 2015
      5Site 107 May 201508 May 2015

       

       

      ===============================

       

      But to eliminate the second resident load I wish to script this in a preceding load, something like this :

       

      Temp01 :

      load

        date(Min(Date)) as [Start Date] ,

        date(Max(Date)) as [End Date] ,

        Sub ,

        only(Site) as Site

      group by Sub

      ;

      load

        peek ( Site ) ,

        num(if ( Site = peek ( Site ) , peek( [Sub] ) , alt ( peek( [Sub] ) + 1  , 1 ) ) ) as [Sub] ,

        *

      resident Data

      order by Date , Site

      ;

       

      drop table Data ;

       

       

      But then I get this output table, which is wrong.

       

      Sub Site Start Date End Date
      1Site 202 May 201506 May 2015
      Site 101 May 201508 May 2015

       

      ===============================

       

      Anybody got any idea where it is going adrift ?

          • Re: Preceding load with group by
            Peter Cammaert

            Bill, I haven't got a clue. But from my own attempts a few years ago I remember a similar situation that couldn't be explained. I then ran into Qlik support, and after a few WADs I simply gave up on using PRECEDING LOADS (except for the very simple ones) which essentially are nothing more than script-text-shortening techniques. Too much trouble, too unreliable.

             

            In the end, in your case the exact same thing will be happening behind the scenes as in your original script: Resident tables are created to pass data from one PRECEDING LOAD to the next. A GROUP BY and an ORDER BY cannot be performed "on-the-fly"...

             

            I did manage to behave (i.e. not comment) when reading this blog post: Preceding Load

             

            P.

            • Re: Preceding load with group by
              Tresesco B

              I guess that is because the preceding load Group By is also being considered by it's following resident load. And since, the [Sub] field is a calculated field from [Site], the group by is being implemented on Site to give you the output as it comes.

                • Re: Preceding load with group by
                  Stefan Wühl

                  I think it's a little bit different (or I would describe it a little bit different):

                   

                  Your method to create an incremental number for a Site change is relying on Peek() function to address the un-goruped records, but I believe Peek() will address the records from the ouput table (which are the records that the top most preceding LOAD produces, the grouped records in your case). I think there is no method to address the intermediate records on their way from the bottom to top LOAD statement.

                   

                  As Peter mentioned - and looking at the reports about preceding LOAD resulting in worse performance - I would also vote for your first script version.

                   

                  Hope this makes sense,

                  Stefan

                   

                  edit:

                  sorry, I've just noticed that this thread is years old, it seemed to just popped up again...