16 Replies Latest reply: Feb 3, 2016 9:58 PM by Yomna Touni RSS

    Group By multiple fields

      Hi all,

      I have source data which has multiple lines for the one primary key.

      It goes like this

      table EDW:

      Property Reference  | SOR Code | Work Programme Bk  | Location  | Total Cost

      1                                   kit01                  LCQ                       kitchen           100

      1                                   kit03                  LCQ                       kitchen            120

      1                                   kit04                  PR3                       kitchen             140

      1                                   pgi01                 LCQ                       bedroom          95

      1                                   pgi03                 LCQ                       bedroom          150

      1                                   plu40                 SOW                      bath                  400

      2                                   kit50                   LCQ                      kitchen             200

      2                                   plu05                 LCQ                       bath                  400

      2                                   ele30                  LCQ                      WC                    350

      2                                    min95                 RSW                     HALL                 40

       

       

      what i'm trying to do now is figure out the sum of the total cost of all LCQ items for anything starting with 'kit'

      the following is what I have written so far, but the sum works out to be something completely different.

      I would really appreciate a little guidance

       

      LOAD

      [Property Reference],

      SUM(IF(([Total Cost] > 0) AND ([SOR Code] like 'KIT*) AND ([Work Programme Bk] like 'LCQ'), [Total Cost], 0] as KIT_Total_LCQ_Cost

       

      Resident EDW

      Group by [Property Reference] and [Work Programme Bk] and [SOR Code];

       

      When I run this function I keep getting that its an "invalid expression"

       

      Could someone shed a bit of light on where I might be wrong?

       

      Regards,

       

      Yomna

        • Re: Group By multiple fields

          what I also forgot to mention is the fact that when I remove the last 2 fields in the group by line, and just say

          group by [Property Reference] ;

           

          this is where I get a totally different sum to what I expected.

          • Re: Group By multiple fields
            Stefan Wühl

            If you want to group your sum by Property Reference, try

             

            LOAD

            [Property Reference],

            SUM( [Total Cost] ) as KIT_Total_LCQ_Cost

            Resident EDW

            WHERE [Work Programme Bk] = 'LCQ'  AND [SOR Code] LIKE 'kit*' AND [Total Cost] > 0

            Group by [Property Reference];

            • Re: Group By multiple fields
              jagan mohan rao appala

              Hi,

              Use below expressions

               

              LOAD

              [Property Reference],

              SUM(IF(([Total Cost] > 0) AND (WildMatch([SOR Code], 'kit*') AND (Upper([Work Programme Bk]) = 'LCQ'), [Total Cost], 0] as KIT_Total_LCQ_Cost

               

              Resident EDW

              Group by [Property Reference], [Work Programme Bk] ,  [SOR Code];

               

              OR just load the required values by using where conditions

               

              LOAD

              [Property Reference],

              SUM([Total Cost] ) as KIT_Total_LCQ_Cost

              Resident EDW

              WHERE [Total Cost] > 0 AND WildMatch([SOR Code], 'kit*') AND Upper([Work Programme Bk]) = 'LCQ'

              Group by [Property Reference];

               

              Hope this helps you.

               

              Regards,

              jagan.

              • Re: Group By multiple fields
                Settu Periyasamy

                Hi,

                In your script, there is missing the Single Quote and Closed parenthesis...

                LOAD

                [Property Reference],

                SUM(IF(([Total Cost] > 0) AND ([SOR Code] like 'KIT*') AND ([Work Programme Bk] like 'LCQ'), [Total Cost], 0] ) ) as KIT_Total_LCQ_Cost

                Resident EDW

                Group by [Property Reference] and [Work Programme Bk] and [SOR Code];

                Now, it seems working..

                EDW:
                LOAD * INLINE [
                    Property Reference  ,  SOR Code ,  Work Programme Bk  ,  Location  ,  Total Cost
                    1, kit01, LCQ, kitchen, 100
                    1, kit03, LCQ, kitchen, 120
                    1, kit04, PR3, kitchen, 140
                    1, pgi01, LCQ, bedroom, 95
                    1, pgi03, LCQ, bedroom, 150
                    1, plu40, SOW, bath, 400
                    2, kit50, LCQ, kitchen, 200
                    2, plu05, LCQ, bath, 400
                    2, ele30, LCQ, WC, 350
                    2, min95, RSW, HALL, 40
                ];
                
                New:
                LOAD
                [Property Reference],
                SUM(IF( ([Total Cost] > 0) AND ([SOR Code] like 'KIT*') AND ([Work Programme Bk] like 'LCQ'), [Total Cost], 0))as KIT_Total_LCQ_Cost
                Resident EDW
                Group by [Property Reference];
                
                
                

                 

                Capture.JPG