2 Replies Latest reply: Sep 29, 2015 9:26 AM by Digvijay Singh

    duplicate removal

    vir vir

      Hi,

       

      We have some data like the following:

       

       

      ID  Email

      1   abc.email.com

      2   abc.email.com

      3    xyz.email.com

       

       

      The data contains duplicate email IDs that I want to remove. Kindly suggest possible ways to do this in the script.

        • Re: duplicate removal
          Manish Kachhia

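          // Sample data from the question
          temp:
          Load * Inline
          [
            ID,  Email
            1,   abc.email.com
            2,   abc.email.com
            3,   xyz.email.com
          ];

          // Flag every row whose Email matches the Email of the previous row.
          // Sorting by Email first puts duplicates next to each other, so
          // Previous() also catches duplicates that are not adjacent by ID.
          Left Join(temp)
          Load ID, Email, if(Email=Previous(Email),1,0) as Flag Resident temp
          order by Email, ID;

          // Keep only the first (lowest ID) row for each Email
          Final:
          NoConcatenate
          Load ID, Email Resident temp
          Where Flag = 0;

          Drop Table temp;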

          • Re: duplicate removal
            Digvijay Singh

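            // Sample data from the question
            table:
            load * inline [
            ID,  Email
            1,   abc.email.com
            2,   abc.email.com
            3,   xyz.email.com ];

            // One row per Email, keeping the lowest ID in each group
            NoConcatenate
            Load min(ID) as ID, Email resident table group by Email;

            // Remove the original table so that only the de-duplicated rows remain
            drop table table;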
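
            A further option, not taken from either reply above, is to drop the duplicates while the data is being loaded by using Exists(). This is only a sketch: it assumes no earlier table in the script has already loaded a field named Email, and the label Final is just a placeholder. Exists(Email) checks the Email values loaded so far, so only the first row for each Email should pass the Where clause.

            // Sketch only: assumes Email has not been loaded by an earlier table
            Final:
            load ID, Email inline [
            ID,  Email
            1,   abc.email.com
            2,   abc.email.com
            3,   xyz.email.com ]
            where not Exists(Email);

            For the sample data this should leave IDs 1 and 3. Unlike the min(ID) approach, it keeps whichever row is read first for each Email, so the input order decides which ID survives.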