2 Replies Latest reply: Jan 16, 2013 3:55 PM by James Rowley

    Cleansing Records and AutoNumber (QV Expressor)

      A couple of questions;

       

      1. Is there a built-in function that can add an autonumber or key to records as they are read in (from any source)?

       

      • My thought is that I would need to write an expression in a transform to create the value as the records are processed?

       

      2. When processing records for cleansing, I would like to find an efficient way of replacing values in a data set. For example, I have a spreadsheet that contains 100+ attributes, and some of the records for each attribute can contain a value (x) which I would like to replace globally as the file is read in or processed.

       

      • I can see that I could just write an expression function in a transform for each attribute to convert the value, but this would be tedious. Would another form of script work, e.g. a datascript that loads all values into a datascript table and then does some sort of global find/replace?

       

      Thoughts or suggestions appreciated.

       

      Thanks

      James

        • Re: Cleansing Records and AutoNumber (QV Expressor)

          For #1, will the utility.sequence function meet your needs? If so, your code will need to keep track of the values it issues and, after processing all the records, save the next value using the utility.store_integer function in the finalize function. The next time the data flow runs, pick up this value using the utility.retrieve_integer function in the initialize function so the code can set the starting point for utility.sequence.
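The save-and-resume pattern described above can be sketched in plain Lua. This is only an illustration under stated assumptions: the thread does not show the signatures of utility.sequence, utility.store_integer, or utility.retrieve_integer, so the sketch uses hypothetical stand-ins (store_integer, retrieve_integer, and a simple counter) backed by an in-memory table. The key name "next_id" and the output attribute id are also made up for the example.

```lua
-- Hypothetical stand-ins for the persistence calls named in the reply.
-- The real utility.* signatures are not shown in this thread, so a plain
-- table plays the role of the persistent store here.
local store = {}
local function store_integer(name, value) store[name] = value end
local function retrieve_integer(name) return store[name] end

local next_id  -- next value to hand out during this run

function initialize()
  -- Resume from the saved value, or start at 1 on the very first run.
  next_id = retrieve_integer("next_id") or 1
end

function transform(input)
  -- Stamp the record with the current value, then advance the counter.
  input.id = next_id
  next_id = next_id + 1
  return input
end

function finalize()
  -- Save the value the next run should start from.
  store_integer("next_id", next_id)
end
```

Calling initialize, then transform once per record, then finalize mirrors the order the reply describes. A second run then continues numbering where the first one stopped instead of restarting at 1.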

           

          For #2, in the transform function of the Transform operator's function rule, the argument input is already a string-indexed table, so you can use the pairs iterator function to visit each attribute's value.

           

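          -- "foo" is the value to find and "bar" its replacement; both are placeholders.
          function transform(input)
            for k, v in pairs(input) do
              -- Assigning to an existing key while traversing with pairs is safe in Lua.
              if v == "foo" then input[k] = "bar" end
            end
            -- Every attribute is passed through, so the input table is the output.
            return input
          end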
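If more than one value needs replacing, the same loop can consult a lookup table instead of a single comparison. The following is a minimal sketch in plain Lua, not something taken from the Expressor documentation: the replacements table, its entries, and the sample record are all hypothetical.

```lua
-- Hypothetical mapping from unwanted values to their replacements.
local replacements = { x = "", ["N/A"] = "unknown" }

function transform(input)
  for k, v in pairs(input) do
    -- Look up the current value; nil means "leave this attribute alone".
    local r = replacements[v]
    if r ~= nil then input[k] = r end
  end
  return input
end

-- Usage with a made-up record: only city and phone are changed.
local record = transform({ name = "Ann", city = "x", phone = "N/A" })
print(record.name, record.city, record.phone)
```

Because the loop visits every attribute by name, the 100+ columns never need to be listed individually. Adding another value to clean only means adding one entry to the table.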