2 Replies Latest reply: Jan 23, 2018 10:18 AM by Erica Whalley RSS

    Load only row with max data

    Neha Saxena

      Hello Experts,

       

      I have been trying to load only the max date record with the following script -

       

      [Twitter]:

      LOAD [company_name],

      [positive_percent],

      [negative_percent],

      [neutral_percent],

      [load_date]

      FROM [lib://Desktop/Twitter.xlsx]

      (ooxml, embedded labels, table is Sheet1);

       

      Right Join (Twitter)

      LOAD [company_name],

      Max([load_date]) as Date

      Resident Twitter

      Group by [company_name];

       

      After reload I am still getting all the rows (attached image)

       

      I am kind of stuck here. Is it because I have timestamp in my load_date field ?

       

      Please guide me.

      Best Regards,

      Neha

        • Re: Load only row with max data
          Sunny Talwar

          Try this

           

          [Twitter]:

          LOAD [company_name],

          [positive_percent],

          [negative_percent],

          [neutral_percent],

          [load_date]

          FROM [lib://Desktop/Twitter.xlsx]

          (ooxml, embedded labels, table is Sheet1);

           

          Right Join (Twitter)

          LOAD [company_name],

          Max([load_date]) as [load_date]

          Resident Twitter

          Group by [company_name];

          • Re: Load only row with max data
            Erica Whalley

            Hi, I think you're almost there.

             

            I would make sure you format the date in the original table and the date in the right join the same. I imagine the date is coming through like you said as a timestamp, and the max(...) is returning a number.

             

            Eg

            LOAD

            ...

            timestamp(Load_Date) as Load_Date

            ...

            from [...];

             

            right join (xxx)

            Load

            company_name,

            timestamp(max(Load_Date)) as Load_Date

            REsident.....

             

            The other thing you could try is the firstSortedValue() function, but you would need to repeat it for every field. This returns the first value, for a field in a specified sort order. You would need to group by company name, as this is an aggregated function.

             

            EG

             

            Load

            Company_Name,

            firstsortedvalue([positive_percent],Load_date) as positive_percent,

            firstsortedvalue([negative_percent],Load_date) as negative_percent,

            ...

            FROM [lib://Desktop/Twitter.xlsx]

            GROUP BY Company_Name;

             

            I prefer the "right join" solution though!

             

            Erica