2 Replies Latest reply: May 27, 2014 5:44 AM by Erik Furlanis RSS

    remove similar but not identical data from qvx

      Dear everyone,

       

      I am not sure anymore how I ended in this situation, but here's the issue.

       

      I have a qvx to which I am appending data every week.

      But for one week I have double entries; these are not mirrored but rather similar to each other.

       

      Example here below:

       

      Timestampnum(timestamp)CustomerDataWhat to do
      07-04-14 7:02:52 PM41.736,79365740740500A10KEEP
      07-04-14 7:02:52 PM41.736,79365740740500B20KEEP
      07-04-14 7:02:52 PM41.736,79365740740500C15KEEP
      07-04-14 7:02:52 PM41.736,79365740740500D

      13

      KEEP
      07-04-14 7:02:52 PM

      41.736,79365740739700

      A11REMOVE

       

      Basically the timestamp look (almost) the same.

       

      I tried to exclude data with

       

      LOAD *
      WHERE NOT WildMatch(num(timestamp),'41736.7936574073*');
      

      with no success

       

      For the data I want to keep there is a full dataset.

      The data I want to remove, just the first customer has been stored and data is somewhat similar but not identical to the data I want to keep.

       

      Amount of data is 500k records (under customer A only) and 20 fields or so.

      How can I proceed to remove that second part of data?

       

      many many thanks