Do you have any tips, tutorials, or specific components for performing an incremental load of tables?
I want to load all new records from source to target, where the target does not yet have those records, and I need to be able to handle errors. For example, suppose 1,000 records in the source need to be inserted into the target, and the job fails after inserting 500 of them. If I have to rerun the job, how do I skip those 500 already-inserted records? The database is Oracle.
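One common way to make the rerun safe is to filter out rows that already exist in the target (in Oracle this is typically a `MERGE` statement or an `INSERT ... SELECT ... WHERE NOT EXISTS`). Here is a minimal sketch of that idea; it uses Python's `sqlite3` as a self-contained stand-in for Oracle, and the table and column names are made up for illustration:

```python
import sqlite3

# sqlite used as a stand-in for Oracle; the same idea maps to Oracle's
# MERGE statement or an INSERT ... SELECT ... WHERE NOT EXISTS filter.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE source (id INTEGER PRIMARY KEY, val TEXT)")
cur.execute("CREATE TABLE target (id INTEGER PRIMARY KEY, val TEXT)")
cur.executemany("INSERT INTO source VALUES (?, ?)",
                [(i, "row%d" % i) for i in range(1, 1001)])

# Simulate a first run that failed after inserting only 500 rows.
cur.execute("INSERT INTO target SELECT * FROM source WHERE id <= 500")

# Rerun: insert only rows not already present in the target, so the
# job can be restarted after a partial failure without duplicates.
cur.execute("""
    INSERT INTO target
    SELECT s.id, s.val FROM source s
    WHERE NOT EXISTS (SELECT 1 FROM target t WHERE t.id = s.id)
""")
conn.commit()

cur.execute("SELECT COUNT(*) FROM target")
print(cur.fetchone()[0])  # all 1000 rows, no duplicates
```

Because the second `INSERT` only touches missing rows, running it any number of times leaves the target in the same state.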
Overall goal:
Step 1: Insert a record into an audit table with Seq, JobName, JobStartDt, Status, DataFrom and DataTill.
Step 2: Load the actual dimension/transaction data where ChangeDt >= DataFrom and ChangeDt < sysdate.
This will involve both inserts and updates.
Step 3: Run the first step's flow again to update JobEndDt and set Status to Complete.
These steps need to run daily for each job.
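The three steps above can be sketched as a pair of small routines around the audit table. This is only an illustrative sketch (again with `sqlite3` standing in for Oracle, and function names of my own choosing); the key point is that `DataFrom` is derived from the `DataTill` of the last *Complete* run, so a failed run is automatically retried over the same window:

```python
import sqlite3
from datetime import datetime

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("""CREATE TABLE TabA (
    SeqID INTEGER PRIMARY KEY AUTOINCREMENT,
    JobName TEXT, JobStartDt TEXT, JobEndDt TEXT,
    Status TEXT, DataFrom TEXT, DataTill TEXT)""")

def start_job(job, data_till):
    # Step 1: open an audit row. DataFrom resumes from the last
    # successful run, so an Error run's window is re-processed.
    cur.execute("""SELECT DataTill FROM TabA
                   WHERE JobName = ? AND Status = 'Complete'
                   ORDER BY SeqID DESC LIMIT 1""", (job,))
    row = cur.fetchone()
    data_from = row[0] if row else "1900-01-01"
    cur.execute("""INSERT INTO TabA
                   (JobName, JobStartDt, Status, DataFrom, DataTill)
                   VALUES (?, ?, 'Running', ?, ?)""",
                (job, datetime.now().isoformat(), data_from, data_till))
    return cur.lastrowid, data_from, data_till

def end_job(seq_id, status):
    # Step 3: close the audit row with an end time and final status.
    cur.execute("UPDATE TabA SET JobEndDt = ?, Status = ? WHERE SeqID = ?",
                (datetime.now().isoformat(), status, seq_id))

seq, frm, till = start_job("job1", "2015-08-20")
# Step 2 would load rows with ChangeDt >= frm AND ChangeDt < till here,
# using a duplicate-safe insert/update so reruns are harmless.
end_job(seq, "Complete")
```

In a Talend job, Step 1 and Step 3 would typically be small subjobs (e.g. `tOracleInput`/`tOracleOutput` against the audit table) wrapped around the main load subjob.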
TabA -- (Audit Table)
SeqID, JobName, JobStartDt, JobEndDt, Status, DataFrom, DataTill
1, job1, 20/8/2015 01:30 AM, 20/8/2015 01:50 AM, Complete, 18/8/2015, 19/8/2015
2, job1, 21/8/2015 01:30 AM, null, Error, 19/8/2015, 20/8/2015
3, job1, 22/8/2015 01:30 AM, null, null, 19/8/2015, 21/8/2015
TabB --
EmpID, EmpName, DOJ, Address1, Address2, ChangeDt, LoadDt
1,
2
3
An address change may occur multiple times for the same user, and I need each such change loaded into the target.
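Since a target row may already exist when an address changes again, the load needs update-or-insert semantics. In Oracle this is a single `MERGE` on `EmpID`; the sketch below shows the equivalent logic explicitly (sqlite stand-in, hypothetical column subset), guarded by `ChangeDt` so replaying an older change on a rerun does nothing:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("""CREATE TABLE TabB_target (
    EmpID INTEGER PRIMARY KEY, EmpName TEXT,
    Address1 TEXT, ChangeDt TEXT)""")

def upsert(emp_id, name, addr, change_dt):
    # Update the existing row only if this change is newer; insert
    # when no row was touched. In Oracle this whole function
    # collapses into a single MERGE statement keyed on EmpID.
    cur.execute("""UPDATE TabB_target
                   SET EmpName = ?, Address1 = ?, ChangeDt = ?
                   WHERE EmpID = ? AND ChangeDt < ?""",
                (name, addr, change_dt, emp_id, change_dt))
    if cur.rowcount == 0:
        cur.execute("INSERT OR IGNORE INTO TabB_target VALUES (?, ?, ?, ?)",
                    (emp_id, name, addr, change_dt))

upsert(1, "Ann", "Old St", "2015-08-18")
upsert(1, "Ann", "New St", "2015-08-19")  # later address change wins
cur.execute("SELECT Address1 FROM TabB_target WHERE EmpID = 1")
print(cur.fetchone()[0])  # New St
```

Because both branches are idempotent, rerunning the same window after a partial failure neither duplicates rows nor overwrites newer data with older data.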
Hi ghudson_js1,
Could you please take a look at Talend Change Data Capture (CDC)? It identifies and captures data that has been added to, updated in, or removed from database tables, and makes that change data available for later use by applications or individuals. See whether it is what you are looking for:
TalendHelpCenter: CDC architectural overview.
Best regards
Sabrina