4 Replies Latest reply: Mar 7, 2016 11:07 AM by Rui Esteves RSS

    Data cleansing - reduce multiple inconsistent data to unique records

    Rui Esteves

      Hi

      I am struggling to load some data which is not consistent and I would appreciate your help.

       

      I have a database with 4 fields: id1, id2, description, and branch

      The identification of the customer is given by the concatenation of the fields id1 and id2. So, when loading the data, I load it as:

      id1&id2 as identification

       

      Unfortunately, for the same identification (id1&id2), each branch registers the customer's name in a subjective way. Some use abbreviations, other use partial name, etc.

       

      As I want to have a resulting table with 2 fields: identification, description (only one of the names. any of the used names suits me, so I could peek the first one or the last one for the same identification, for example), how should I load the data?

       

      I already tried lots of options, but without any success.

      for example, using the ApplyMap would require to maintain a conversion map and that is not practical.

       

      Any ideas?

       

      Thank you