Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 
zagzebski
Creator
Creator

Incremental load based on a date and flag


I have an incremental load set up in my script where I add the new months data when it becomes available.

However I have (2) issues:...

1. I need something in my script that says run the new data when it becomes available (so for example on December 1 November's data is available so do the incremental load - but it is not always available on the 1st of the month)

2. We rerun our script every night so once #1 (above) runs the new months data I don't want to run it again until the next months data becomes available (December).

I would like to do this in an automated fashion so the users never have to touch the script...

Any ideas?

Thanks in advance,

Steve

1 Solution

Accepted Solutions
Anonymous
Not applicable

dathu.qv is right.  The key is doing this based on a date comparison.  Your nightly job still runs, but it only runs through to complete script if the date condition is met.

I do this with loads from Salesforce.  See the attached script example.  Each row of the Salesforce.com dataset has a date stamp indicating when it was created/inserted.  In a nutshell my nightly script:

1.  Checks the existing QV table - finds the MAX(QV_CreateDate)

2.  Checks the live Salesforce.com table - finds the MAX(SFDC_CreateDate)

3.  Compares them to each other and creates a dynamic SQL Where clause to only pull the rows where MAX(QVCreateDate) < MAX(SFDC_CreateDate)

For your purposes, you can modify #3 to simply check today's date, instead of checking the DB directly.

Note:  the attached script is a little more elaborate, but should help you.

Regards,

John

View solution in original post

6 Replies
Anonymous
Not applicable

Hope i have understood your issue.

First, How do you identify if the November data is available or not. Is there a field value that you check, if so. Then you can set a variable or something, lets say if the first data check completes and data is available then set the variable to "1" else to "0". And the remaining part of the script will only run if the variable value is one else it will not. This way even if you schedule the job to run every night, the job would complete it just few seconds if the data is not there and if its there it would extract the data.

zagzebski
Creator
Creator
Author


Thanks for the response.

* The November data SHOULD be available on the 1st of December. However I can't count on that because it might be on the 2nd or 3rd of December.

I am still confused how to (1) identify when the new months data is ready and (2) how do you set the variable to "1" or "0" - I have never done that before.

Anonymous
Not applicable

The way i see it.

I assume - you are extracting data from a DB.

If Yes.

First time the extract runs, You can take a Count(Transaction) and store that to a variable.Lets say something like below.

Let CountofTransactions = NoOfRows('DataTable');

Now when the second time the extarct runs, first tale a count of records of that table.

Check:

SQL SELECT COUNT(1)  as COUNTCHECK FROM DB.TABLE;

LET DBTableRowCount = Peek('COUNTCHECK', 0, 'Check');

IF $(DBTableRowCount) > $(CountofTransactions) then

Your Extract Script.

After the extracts again count the records in the updated table and update this variable.

Let CountofTransactions = NoOfRows('DataTable');

ELSE

SET ExtractCheck = "There is no new data available to Extract";

ENDIF

hope this helps

Not applicable

Create the Materialized View in you DB, and this having only MonthName & NooFRows field.


Before loading the Main table, Load above Materialized view and validate the Dec Month having rows or not.


If rows exist continue loading the data otherwise exist the script.

Anonymous
Not applicable

dathu.qv is right.  The key is doing this based on a date comparison.  Your nightly job still runs, but it only runs through to complete script if the date condition is met.

I do this with loads from Salesforce.  See the attached script example.  Each row of the Salesforce.com dataset has a date stamp indicating when it was created/inserted.  In a nutshell my nightly script:

1.  Checks the existing QV table - finds the MAX(QV_CreateDate)

2.  Checks the live Salesforce.com table - finds the MAX(SFDC_CreateDate)

3.  Compares them to each other and creates a dynamic SQL Where clause to only pull the rows where MAX(QVCreateDate) < MAX(SFDC_CreateDate)

For your purposes, you can modify #3 to simply check today's date, instead of checking the DB directly.

Note:  the attached script is a little more elaborate, but should help you.

Regards,

John

zagzebski
Creator
Creator
Author

Thanks John!