3 Replies Latest reply: Feb 21, 2017 3:27 PM by Aar Kay RSS

    Is this possible?

    Mark Ritter

      I have a QVD with 2M rows.  This QVD gets loaded by concatenating 280 excel files.  This is employee data from our HR system.  The same employee_id could be in this QVD 280 times.  So I want to load distinct employee_id so I don't have duplicates.  But I want to load the field values from the last row for this employee. 

      Every day there will be a new excel file to load.  So I can't hard code anything.

       

      I tried to add max() to the fields thinking I would get the last one.  But got an invalid expression error when loading.

       

      Here is the script without any logic to get the last row for each employee.  What can I add to this logic to achieve what I want to do.

       

      Load Distinct Employee_ID,

         Full_Legal_Name,

         Hire_Date_Most_Recent,

         CurrentActive

      FROM [lib://DEV/QVDWDay\WD_EmployeeMasterAll.QVD] (qvd);

       

      So again I would like to have one row for each employee with the latest values for each of the fields.  There are more fields than these.  But if I can get what I want to work for these I can do the rest of them.

        • Re: Is this possible?
          Robert Hutchings

          Hi

           

          I assume you want the latest Hire_Date_Most_Recent?

           

          If so one way is to use

           

          >>> order by Hire_Date_Most_Recent  desc  (after initial load. It can only be done using residential load)

          and >>> where not exists (Distinct Employee_ID)

           

          of course you would need to rename Distinct Employee_ID (where not exists). Or restrict just to the second table

           

          something like

           

          TEMPLOAD:

          Load

          Employee_ID,

             Full_Legal_Name,

             Hire_Date_Most_Recent,

             CurrentActive

          FROM [lib://DEV/QVDWDay\WD_EmployeeMasterAll.QVD] (qvd);

           

          DATA:

          LOAD

          *

          where not exists (EMPLOIDTEMP) ;

          Load

          Employee_ID,

          Employee_ID AS  EMPLOIDTEMP,

             Full_Legal_Name,

             Hire_Date_Most_Recent,

             CurrentActive

          RESIDENT TEMPLOAD

          order by Hire_Date_Most_Recent  desc ;

           

          DROP TABLE TEMPLOAD ;

          DROP FIELD EMPLOIDTEMP ;

          • Re: Is this possible?
            Michael Solomovich

            This should help you:

             

            TABLE:

            Load Distinct Employee_ID,

               Full_Legal_Name,

               Hire_Date_Most_Recent,

               CurrentActive

            FROM [lib://DEV/QVDWDay\WD_EmployeeMasterAll.QVD] (qvd);

             

             

            INNER JOIN (TABLE) LOAD

              Employee_ID,

              max(Hire_Date_Most_Recent) as Hire_Date_Most_Recent

            RESIDENT TABLE

            GROUP BY Employee_ID;

             

             

            Probably the Hire_Date_Most_Recent is not the field you want to use, I just use the only date field I see here.  Hope you'll figure out what field you need.

            • Re: Is this possible?
              Aar Kay

              Yes,

              Instead of concatenating the new file to the old one, First load the new file then concatenate the old one with Where exists Clause - Where Not exists (employee).

              Eg:

              Load Distinct Employee_ID,

                 Full_Legal_Name,

                 Hire_Date_Most_Recent,

                 CurrentActive

              FROM

                   NewFile;


              Concatenate

                 Full_Legal_Name,

                 Hire_Date_Most_Recent,

                 CurrentActiv

              FROM

                   OldFile

              Where

                   Not Exists(Full_Legal_Name);

               

              or

               

              If you want to keep the employee latest data from the final  QVD then there should be a date field to sort out and keep the latest records