4 Replies Latest reply: Feb 4, 2018 7:53 AM by igor gois RSS

    reduce table rows with date_from and date_to columns

    igor gois

      hello,

       

       

      i have a huge table (3 billions rows) with the cost of product in each store per day.

       

      the cost doesnt change frequently

       

      I want to reduce the number of rows os this table using 2 columns of date interval (date_from and date_to) instead one row per day

       

      original table:  

      datestorematerialcost
      01/01/2018a12310
      02/01/2018a12310
      03/01/2018a123100
      04/01/2018a12310
      05/01/2018a123200
      06/01/2018a123200

       

       

      result table:

        

      storematerialcostdate_fromdate_to
      a1231001/01/201802/01/2018
      a12310003/01/201803/01/2018
      a1231004/01/201804/01/2018
      a12320005/01/201806/01/2018

       

      i tried to use group by with min(date) and max(date) but it didnt work (see the case of cost 10 in the example above, I got from 01/01 from 04/01)

       

      Im also trying something with one incremental auxiliar column using previous but without success

       

        

      datestorematerialcostaux
      01/01/2018a123101
      02/01/2018a123101
      03/01/2018a1231002
      04/01/2018a123103
      05/01/2018a1232004
      06/01/2018a1232004

       

      does anyone can help me?

       

      thanks in advance

        • Re: reduce table rows with date_from and date_to columns
          Kaushik Solanki

          Try this,

           

          Data:

          Load *,If(Previous(cost) = cost,1,0) as flag inline [

          date, store, material, cost

          01/01/2018, a, 123, 10

          02/01/2018, a, 123, 10

          03/01/2018, a, 123, 100

          04/01/2018, a, 123, 10

          05/01/2018, a, 123, 200

          06/01/2018, a, 123, 200

          ];


          Data1:

          Load *,if(Previous(flag)=1 and flag=0,1,flag) as flag1

          Resident Data order by date desc;


          Drop table Data;


          Load store as s, material as m, cost as c,Date(min(date)) as startdate, Date(max(date)) as enddate

          Resident Data1

          group by store,material,cost,flag1;


          Regards,

          Kaushik Solanki

          • Re: reduce table rows with date_from and date_to columns
            omar bensalem

            Try this:

            table:

            Load date(date#(date,'DD/MM/YYYY')) as date, store, material, cost Inline [

            date, store, material, cost

            date, store, material, cost

            01/01/2018, a, 123, 10

            02/01/2018, a, 123, 10

            03/01/2018, a, 123, 100

            04/01/2018, a, 123, 10

            05/01/2018, a, 123, 200

            06/01/2018, a, 123, 200

            07/01/2018, a, 123, 10

            08/01/2018, a, 123, 10

            09/01/2018, a, 123, 10

            ];

             

            NoConcatenate

            TempTable:

            LOAD *,

              If(material = Previous(material),

              If(cost = Previous(cost), Peek('Flag'), RangeSum(Peek('Flag'), 1)), 1) as Flag

            Resident table

            Order By store, material;

             

            Drop Table table;

             

            finalTable:

            NoConcatenate

            LOAD store,

              material,

              cost,

              Flag,

              Date(Min(date)) as MinDate,

              Date(Max(date)) as MaxDate

            Resident TempTable

            Group By store, material, cost,Flag;

             

             

            drop Field Flag;

            drop Table TempTable;


            Result:

            Capture.PNG