7 Replies Latest reply: Feb 13, 2018 11:22 AM by Noelle Baker RSS

    Conditional elimination of duplicates in script

    Noelle Baker

      Hello,

       

      I have two columns of concern in a qvw file: VIN and Trim Level.  In some cases, Trim Level is listed as 'Unknown'.  There are two possibilities for these cases: 1. VIN is unique or 2. VIN is duplicated on another row, with the other row showing the correct Trim Level (e.g. AA in example below).

       

      In case 1. I want to keep these unique VIN's (even though Trim is 'Unknown').

      In case 2. I want to eliminate the rows with 'Unknown' and keep the rows with the correct Trim Level.

       

      In all cases of duplicates, all other columns (not discussed here) are the same.

       

      Ex.

      VIN        TRIM            Want

      X1          AA                Keep

      X1          Unknown      Eliminate (replace by above)

      X2          BB                Keep

      X3          Unknown      Keep

       

      Ideally, I would like to accomplish this in the Script so the effect cascades to all other sheet elements.

       

      Thanks!

        • Re: Conditional elimination of duplicates in script
          m w

          Order your table by VIN and then by TRIM

           

          Load

               *,

               if(VIN)=peek('VIN') and TRIM = 'Unknown','Eliminate','Keep') as Want

          resident your table

           

          Then resident load where Want = 'Keep'.

            • Re: Conditional elimination of duplicates in script
              Noelle Baker

              Thank you very much for the response.  Perhaps I need to clarify... the 'Want' column listed above is not an additional desired column, but instead indicates the outcome I would like to see in each scenario.

               

              I want to keep every instance where VIN is not duplicated, and I want to keep only one instance when VIN is duplicated (in this case, I want to keep the instance where the Trim Level is defined (i.e. not 'Unknown')).

               

              Could you please offer another suggestion?

               

              Thank you.

                • Re: Conditional elimination of duplicates in script
                  m w

                  The Want column is what allows you to identify the records that you want to keep.

                   

                  After creating the Want column, you do a resident load on that table where Want = 'Keep'

                  Then you can drop the Want Field.

                   

                  See the attached qvw.

                    • Re: Conditional elimination of duplicates in script
                      Noelle Baker

                      Hello m w,

                       

                      Thank you for the additional response.  I understand what you are suggesting with the Resident tables.  Unfortunately, I am not able to get the script to fully run, with the following errors:

                       

                      Table not found

                      VINSort:

                      ...

                       

                      Table not found

                      Drop table statement

                      ...

                       

                      Table not found

                      EliminateDups:

                       

                      Table not found

                      Drop table statement

                       

                       

                      Here is the script I'm using:

                      Initial:
                      LOAD vin,
                      trim_level,
                      (all my other variables)

                      FROM
                      [C:\Users\... .qvx] (
                      qvx);

                       

                      VINSort:
                      NoConcatenate
                      LOAD
                      vin,
                      trim_level,
                      (all my other variables),
                      if(vin=peek('vin') and trim_level = 'Unknown','Eliminate','Keep') as Want
                      resident EliminateDups;
                      drop table EliminateDups;

                      EliminateDups:
                      NoConcatenate
                      LOAD  vin,
                      trim_level
                      Resident VINSort
                      where Want = 'Keep';
                      DROP table VINSort;

                      exit script;

                       

                      Are you able to spot my error?

                       

                      Thanks again very much for your time.

                       

                        • Re: Conditional elimination of duplicates in script
                          Sunny Talwar

                          Shouldn't this be Resident from Initial, rather than from ElminateDups

                           

                          Initial:
                          LOAD vin,
                               trim_level,
                               (all my other variables)
                          FROM [C:\Users\... .qvx] (qvx);

                           

                          VINSort:
                          NoConcatenate
                          LOAD vin,
                               trim_level,
                               (all my other variables),
                               if(vin=peek('vin') and trim_level = 'Unknown','Eliminate','Keep') as Want
                          Resident EliminateDups Initial;
                          DROP Table EliminateDups Initial;

                           

                          EliminateDups:
                          NoConcatenate
                          LOAD  vin,
                               trim_level
                          Resident VINSort
                          Where Want = 'Keep';


                          DROP Table VINSort;

                           

                          EXIT Script;

                            • Re: Conditional elimination of duplicates in script
                              Noelle Baker

                              Thank you all for the responses.  Using the base code from m w and the tip from Sunny, I was close.

                               

                              Since my trim_level could be after the 'Unknown' alphebetically, (e.g. trim_level = XL), I needed to reassign the name of 'Unknown' as 'ZZZZ' in order to work properly with the 'peek' function.  This seemed to do the trick.

                               

                              Thanks again - appreciate the kind and patient answers to a new user question.

                              Noelle

                    • Re: Conditional elimination of duplicates in script
                      Ruben Marin

                      Hi Noelle, another option with Exists:

                       

                      // Load TRIM<>'Unknown'

                      Final:

                      LOAD *,

                           VIN as chkVIN

                      Resident Table

                      Where TRIM<>'Unknown' and not exists('chkVIN',VIN);


                      // Add records that only has TRIM='Unknown'

                      Concatenate (Final)

                      LOAD *,

                           VIN as chkVIN

                      Resident Table

                      Where TRIM='Unknown' and not exists('chkVIN', VIN);

                       

                      DROP Field chkVIN;