Anonymous
Not applicable

Talend job running in a loop due to Kafka topic? Need inputs

Hi,

 

I have a Talend job that passes Kafka messages and sends them to the consumers. After the job has completed its work, it still keeps running in a loop and doesn't end.

Is this caused by the Kafka messages or something else? Do we need to close the connection?

11 Replies
Anonymous
Not applicable
Author

Can you show us a screenshot of your job? It is difficult to identify what might be happening without knowing a little more about the job.

Anonymous
Not applicable
Author

Thanks rhall for your prompt response.

 

Here is my scenario: the job reads from a Kafka topic. I dropped 6 files in my directory and processed them with Talend, so all 6 messages are stored in the Kafka topic. After all 6 files have been processed, the job runs again for all 6 files, so it keeps running in a loop.

I have attached a screenshot of the tKafkaInput component I am using. Is there something we need to do, like resetting the offset?

 

Thanks in advance.


kafka issue.PNG
Anonymous
Not applicable
Author

What does the job look like? Are you using a tFileList? Is the job triggered by the files arriving in the TAC? Are the files being moved after you have processed them?

Anonymous
Not applicable
Author

Yes. Basically, I am getting the data from MongoDB, storing it in a Kafka topic as a message, and passing it on.
Yes, I am using tFileList.

I have attached the flow of the job. Let me know if you need any other details.


kafka issue2.PNG
Anonymous
Not applicable
Author

Your job is driven by the Kafka component. It will iterate your tFileList for every message that is returned. So if you have 1000 messages, your tFileList will fire 1000 times. I don't think you have the flow that you want. Can you explain your requirement regarding the files? Are the files supposed to drive this process?

Anonymous
Not applicable
Author

Basically, I have 4 different jobs. Two of them run fine, but the problem appears in the 3rd job, which is built as follows:

step1: tKafkaConnection and tMongoDBConnection

step2: take input from the MongoDB collection.

step3: OnSubjobOk --> take input from the Kafka topic. The screenshot of the job is below.

 

Functionally: if we start with 6 files, the first 2 jobs run fine, but in this job the 6 files are processed again after they have completed successfully, so it runs in a loop. This shouldn't happen. Any idea?

 

Could you help me if I share my screen?

 

0683p000009M4bO.png

Anonymous
Not applicable
Author

I'm pretty sure this is because your Kafka topic has lots of messages (and probably has messages added while the job is running). Every message retrieved from Kafka will trigger every file (6 of them) to be processed. So the flow is this:

 

1) Kafka message received

2) Process all files

3) Go to 1
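To make the multiplication concrete, here is a rough plain-Java sketch of that flow. It is only an illustration, not the code Talend generates: the class name, the method name and the message and file values are all made up, and the lists stand in for the Kafka topic and the tFileList.

```java
import java.util.List;

// Plain-Java illustration only; hypothetical names, not Talend-generated code.
class MessageDrivenLoopSketch {

    // Counts how many file-processing runs happen when every message
    // triggers a full pass over the file list.
    static int countFileRuns(List<String> messages, List<String> files) {
        int runs = 0;
        for (String message : messages) {   // 1) Kafka message received
            for (String file : files) {     // 2) process all files
                runs++;                     // one file processed for this message
            }
        }                                   // 3) go to 1 (next message)
        return runs;
    }

    public static void main(String[] args) {
        List<String> files = List.of("f1", "f2", "f3", "f4", "f5", "f6");
        List<String> messages = List.of("m1", "m2", "m3");
        // Number of runs is messages x files.
        System.out.println(countFileRuns(messages, files));
    }
}
```

With a topic that keeps receiving messages, the outer loop has no natural end, which would match the behaviour described.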

 

This is highly likely to keep going, especially if your Kafka topic has many messages coming in. You can set the tKafkaInput to stop after a period of time or number of messages received. This is configured inside the tKafkaInput component.
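The idea behind those two stop conditions can be sketched in plain Java. This does not use the Kafka client or any Talend API; the iterator is a stand-in for the topic, and `consume`, `maxMessages`, `maxMillis` and the injected clock are hypothetical names chosen for the example.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.LongSupplier;
import java.util.stream.Stream;

// Plain-Java illustration only; hypothetical names, no Kafka or Talend API involved.
class BoundedConsumeSketch {

    // Reads from the source until either the message limit or the time limit
    // is reached, or the source has nothing more to offer.
    static List<String> consume(Iterator<String> source, int maxMessages,
                                long maxMillis, LongSupplier clock) {
        List<String> received = new ArrayList<>();
        long deadline = clock.getAsLong() + maxMillis;
        while (received.size() < maxMessages
                && clock.getAsLong() < deadline
                && source.hasNext()) {
            received.add(source.next());
        }
        return received;
    }

    public static void main(String[] args) {
        // An endless source, standing in for a topic that keeps receiving messages.
        Iterator<String> endless = Stream.iterate(0, i -> i + 1).map(i -> "m" + i).iterator();
        List<String> batch = consume(endless, 5, 60_000L, System::currentTimeMillis);
        System.out.println(batch.size() + " messages consumed, then the loop ends");
    }
}
```

Without one of these limits the `while` loop would only end when the source runs dry, which an active topic never does.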

 

If you want this process to run 6 times (once per file), you need to drive it from the tFileList. If each file needs all of the information from the messages, you need a separate subjob that consumes the data from Kafka before the file processing starts. You can store it in a tHashOutput component. With this approach, you will need to set a limit on what the Kafka component consumes.
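As a rough plain-Java sketch of that restructured flow: the first method plays the role of the Kafka subjob with a limit, the in-memory list stands in for what you would keep in tHashOutput, and the second method plays the role of the tFileList-driven subjob. All names here are hypothetical and nothing in it is Talend or Kafka API.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Plain-Java illustration only; hypothetical names, not Talend-generated code.
class FileDrivenFlowSketch {

    // Subjob 1: a bounded read from the message source into an in-memory buffer.
    static List<String> bufferMessages(Iterator<String> source, int maxMessages) {
        List<String> buffer = new ArrayList<>();
        while (buffer.size() < maxMessages && source.hasNext()) {
            buffer.add(source.next());
        }
        return buffer;
    }

    // Subjob 2: the file list drives the loop, so each file is handled exactly once
    // and can see every buffered message.
    static List<String> processFiles(List<String> files, List<String> buffered) {
        List<String> log = new ArrayList<>();
        for (String file : files) {
            log.add(file + " processed with " + buffered.size() + " messages");
        }
        return log;
    }

    public static void main(String[] args) {
        List<String> topic = List.of("m1", "m2", "m3", "m4");
        List<String> buffered = bufferMessages(topic.iterator(), 3);
        List<String> files = List.of("f1", "f2", "f3", "f4", "f5", "f6");
        processFiles(files, buffered).forEach(System.out::println);
    }
}
```

Here the number of runs depends only on the number of files, however many messages are in the topic.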

Anonymous
Not applicable
Author

Thanks @rhall for your time and for responding to my issue.

I will definitely try this. One more thing I observed today: the job runs fine on my local machine, but when the same code is pushed to Talend Cloud and run there, this issue occurs.

Any idea on this? Isn't it strange?

Anonymous
Not applicable
Author

If the job behaves differently between Studio and the Cloud, you should first check that this is not caused by running in a different environment. A development environment would normally be used to test jobs in the Studio, whereas I would expect most people running in the Cloud to have moved to TEST or PROD. Make sure this isn't the cause (maybe there is a different amount of data in your Kafka topic between the environments?). Otherwise, I would take this issue to support. You will have a support account if you are running in the Cloud.