6 Replies Latest reply: Nov 15, 2013 12:49 PM by Steve Zagzebski RSS

    Incremental load based on a date and flag

    Steve Zagzebski


      I have an incremental load set up in my script where I add the new months data when it becomes available.

       

      However I have (2) issues:...

       

      1. I need something in my script that says run the new data when it becomes available (so for example on December 1 November's data is available so do the incremental load - but it is not always available on the 1st of the month)

      2. We rerun our script every night so once #1 (above) runs the new months data I don't want to run it again until the next months data becomes available (December).

       

      I would like to do this in an automated fashion so the users never have to touch the script...

       

      Any ideas?

       

      Thanks in advance,

       

      Steve

        • Re: Incremental load based on a date and flag
          Aadil M

          Hope i have understood your issue.

           

          First, How do you identify if the November data is available or not. Is there a field value that you check, if so. Then you can set a variable or something, lets say if the first data check completes and data is available then set the variable to "1" else to "0". And the remaining part of the script will only run if the variable value is one else it will not. This way even if you schedule the job to run every night, the job would complete it just few seconds if the data is not there and if its there it would extract the data.

            • Re: Incremental load based on a date and flag
              Steve Zagzebski


              Thanks for the response.

               

              * The November data SHOULD be available on the 1st of December. However I can't count on that because it might be on the 2nd or 3rd of December.

               

              I am still confused how to (1) identify when the new months data is ready and (2) how do you set the variable to "1" or "0" - I have never done that before.

                • Re: Incremental load based on a date and flag
                  Aadil M

                  The way i see it.

                   

                  I assume - you are extracting data from a DB.

                   

                  If Yes.

                   

                  First time the extract runs, You can take a Count(Transaction) and store that to a variable.Lets say something like below.

                   

                  Let CountofTransactions = NoOfRows('DataTable');

                   

                  Now when the second time the extarct runs, first tale a count of records of that table.

                   

                  Check:

                  SQL SELECT COUNT(1)  as COUNTCHECK FROM DB.TABLE;

                   

                  LET DBTableRowCount = Peek('COUNTCHECK', 0, 'Check');

                   

                  IF $(DBTableRowCount) > $(CountofTransactions) then

                   

                  Your Extract Script.

                   

                  After the extracts again count the records in the updated table and update this variable.

                   

                  Let CountofTransactions = NoOfRows('DataTable');

                   

                  ELSE

                   

                  SET ExtractCheck = "There is no new data available to Extract";

                   

                  ENDIF

                   

                   

                  hope this helps

              • Re: Incremental load based on a date and flag
                Srikanth P

                Create the Materialized View in you DB, and this having only MonthName & NooFRows field.


                Before loading the Main table, Load above Materialized view and validate the Dec Month having rows or not.


                If rows exist continue loading the data otherwise exist the script.

                • Re: Incremental load based on a date and flag
                  John McTurnan

                  dathu.qv is right.  The key is doing this based on a date comparison.  Your nightly job still runs, but it only runs through to complete script if the date condition is met.

                   

                  I do this with loads from Salesforce.  See the attached script example.  Each row of the Salesforce.com dataset has a date stamp indicating when it was created/inserted.  In a nutshell my nightly script:

                   

                  1.  Checks the existing QV table - finds the MAX(QV_CreateDate)

                  2.  Checks the live Salesforce.com table - finds the MAX(SFDC_CreateDate)

                  3.  Compares them to each other and creates a dynamic SQL Where clause to only pull the rows where MAX(QVCreateDate) < MAX(SFDC_CreateDate)

                   

                  For your purposes, you can modify #3 to simply check today's date, instead of checking the DB directly.

                   

                  Note:  the attached script is a little more elaborate, but should help you.

                   

                  Regards,

                  John