Anonymous
Not applicable

Real-time Change Data Capture in MySQL

I am trying to set up real-time change data capture between two different MySQL databases using Talend Studio.

 

I was able to successfully create a job that uses the publish/subscribe model to pick up only the changed data from the source and populate it in the target database.

 

I could not find documentation on setting up CDC in real time, i.e. as soon as a new row is inserted in the source database it is picked up by the job and populated in the target database. The Talend job would run continuously, watching for possible changes in the source.

 

My question: is scheduling the Talend job with an external scheduler at a desired interval the only option in this case? What options are available in Talend Studio to achieve this?

 

Thanks in advance.

6 Replies
Anonymous
Not applicable
Author

If you are looking for a real-time solution for this, you may want to use the ESB. Essentially the process would remain the same, but you would have a Talend (Apache Camel) route monitoring for changes. When a change occurs, the route would trigger your job to update the target DB.

Anonymous
Not applicable
Author

Hi,

 

Thanks for your reply. I have designed a job whose source is the tMySqlCDC component. This component keeps track of the changes since the last execution of the job, so essentially it is capturing the change data. What is missing is that I still have to run the job for the changes to be reflected in the target database. How do I modify this job so that it continuously watches for changes in the source database, i.e. once started it keeps running and keeps the source and target databases in sync?

 

Thanks once again. 

 

 

Anonymous
Not applicable
Author

Data integration jobs are batch; they start and end. What you need is a Talend (Camel) route. A route stays always on and can monitor a database or a folder for changes. This requires Talend ESB. You will not be able to do what you require with just a job.
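
As a rough illustration only, the idea in plain Camel Java DSL could look like the sketch below (the polling period and the generated job class name are invented; in Studio you would assemble the equivalent graphically in the Mediation perspective):

import org.apache.camel.CamelContext;
import org.apache.camel.builder.RouteBuilder;
import org.apache.camel.impl.DefaultCamelContext;

// Hypothetical class produced when the CDC sync job is built from Studio;
// Talend jobs compile to plain Java classes, so a route can call them directly.
import demo.cdcsync_0_1.CdcSync;

public class CdcPollingRoute {
    public static void main(String[] args) throws Exception {
        CamelContext context = new DefaultCamelContext();
        context.addRoutes(new RouteBuilder() {
            @Override
            public void configure() {
                // The route stays up; every 15 seconds it runs the job,
                // which reads the tMySqlCDC change table and updates the target.
                from("timer:cdcPoll?period=15000")
                    .process(exchange -> new CdcSync().runJobInTOS(new String[0]))
                    .log("CDC sync job executed");
            }
        });
        context.start();
        Thread.sleep(Long.MAX_VALUE); // keep the route (and the JVM) alive
    }
}

The route is the long-running piece; the Talend job itself still behaves as a batch that starts and ends on each tick.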

Irshad1
Contributor II

You can create a cron job in TAC and schedule it to run every 15 minutes. That way you don't have to worry about triggering the job yourself; once scheduled, it will pull the change data into your target every 15 minutes. This gives you near-real-time CDC. You can also change the CDC job trigger to every 5 minutes, depending on the average job completion time.
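
If you are not running TAC, the same fixed-interval idea can be sketched in plain Java, since a built job is just a Java class you can call (the package and class name below are made up):

import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class CdcScheduler {
    public static void main(String[] args) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        // Run the CDC job every 15 minutes, mirroring the TAC trigger interval.
        scheduler.scheduleAtFixedRate(
                () -> new demo.cdcsyncjob_0_1.CdcSyncJob().runJobInTOS(new String[0]),
                0, 15, TimeUnit.MINUTES);
    }
}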

 

Thanks!

Irshad

vapukov
Master II


Talend CDC is not real time.

 

You can look at:

http://maxwells-daemon.io

https://mariadb.com/resources/blog/real-time-data-streaming-kafka-maxscale-cdc

Apache NiFi

StreamSets Data Collector

 

You can push data from CDC to Kafka and then consume the Kafka topic with Talend; this works.


All of these are based on the native replication protocol and work without overloading the server with triggers.
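
As a rough sketch only, reading such a topic from plain Java could look like this (broker address, topic name and group id are placeholders; inside a Talend job, a tKafkaInput component would play this role):

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class MaxwellTopicReader {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder broker
        props.put("group.id", "cdc-sync");                // placeholder consumer group
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("maxwell")); // Maxwell's default topic name
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, String> record : records) {
                    // Each value is a JSON change event built from the MySQL binlog
                    // (database, table, type, data, ...); apply it to the target here.
                    System.out.println(record.value());
                }
            }
        }
    }
}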

 

Anonymous
Not applicable
Author

Say I have Talend ESB ready. I imagine the solution would be a c-component connected to a cTalendJob that wraps the CDC job.

May I know which c-component to use?

 

Thanks.