2 Replies Latest reply: Jun 5, 2014 6:30 AM by Adam Major RSS

    Derived dimension from aggregate max values

      Hi,

       

      In my script I have loaded a table of data with each record representing details of an item number. Multiple items can exist on a job. Some of these items are pre-grouped into categories. See below example for 2 jobs (1000 and 1001).

       

      Job NumberItem NoCategoryRevenue
      10002000A100
      10002001A300
      10002002A50
      10002003B400
      10002004B200
      10002005Null150
      10012006A100
      10012007B120
      10012008Null300
      10012009Null450

       

      If an item has a category of Null, I would like to populate it with the category that currently holds the most revenue for that job.

       

      E.g.

      Job 1000:      Category A = 450, Category B = 600 so Item 2005 should have a value of "B" for Category.

      Job 1001:      Category A = 100, Category B = 120 so Items 2008 and 2009 would both have a value of "B" for Category.

       

      I need to do this in the script itself, creating a new field CategoryNew with values populated for all items.

       

      Can anyone help with this?

       

      Thanks,

      Adam

        • Re: Derived dimension from aggregate max values
          Jeremiah Kurpat

          Hello, I used this script:

           

          Data:

          Load * Inline [

          Job Number,Item No,Category,Revenue

          1000,2000,A,100

          1000,2001,A,300

          1000,2002,A,50

          1000,2003,B,400

          1000,2004,B,200

          1000,2005,Null,150

          1001,2006,A,100

          1001,2007,B,120

          1001,2008,Null,300

          1001,2009,Null,450

          ];

           

           

          Testing:

          Load

          [Job Number],

          Category as Cat,

          sum(Revenue) as Sum

          Resident Data

          where Category <> 'Null'

          Group by [Job Number], Category;

           

           

          Testing2:

          Load

          [Job Number],

          firstsortedvalue(Cat, -Sum) as Category2

          Resident Testing

          where Cat <> 'Null'

          Group by [Job Number];

           

           

          Drop Table Testing;

           

           

          LEFT JOIN (Data)

          Load

            [Job Number],

            'Null' as Category,

            Category2

          Resident Testing2;

           

           

          Final:

          Load

          *,

          if(Category='Null', Category2, Category) as CategoryNew

          Resident Data;

           

           

          Drop Table Data;

          Drop table Testing2;

          Drop field Category2 from Final;

           

          Please find attached.

           

          Hope this helps!