Hi,
I am using Open Studio for Big Data v6.1.1 and have a pipeline that reads data from Kafka, joins it with a small lookup table in a database, and writes the output back to Kafka. The lookup data in the database can change periodically, so my pipeline is:
tLoop -> tJava --(On Component Ok)--> tKafkaInput -> tMap (Lookup using tHiveInput) -> tKafkaOutput
I am using tHiveInput for the lookup flow, with tMap loading its data using the 'Load once' configuration. In tKafkaInput, I have set the 'Stop after a maximum total duration' option to 600000 ms, i.e. 10 minutes. So every 10 minutes the tKafkaInput flow stops and starts again, with tMap reloading the lookup data from Hive.
The problem is that after one iteration of tLoop finishes, tKafkaInput does not read any input data in the next iteration. From the logs, I can see that the consumer ID from the previous iteration is still present, and the new iteration's consumer is not being assigned any partitions.
Can someone please help me here?
Is there any other way I can design my pipeline and achieve the same use case?
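For context, here is a rough sketch of the per-iteration lifecycle I expect from tKafkaInput, written with the plain Kafka consumer in mind. The broker address and group ID are illustrative, the Kafka client calls are left as comments (the Talend component manages them internally), and the timing loop stands in for the 'Stop after a maximum total duration' setting:

```java
import java.util.Properties;

public class IterationSketch {
    // Consumer config built each iteration; these are standard Kafka
    // consumer property names, the values here are illustrative.
    static Properties consumerProps() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker:9092");  // assumed broker address
        props.put("group.id", "lookup-join-group");     // same group every iteration
        props.put("key.deserializer",
                  "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                  "org.apache.kafka.common.serialization.StringDeserializer");
        return props;
    }

    // One tLoop iteration: poll until the configured maximum total
    // duration elapses, then release the consumer.
    static int runIteration(long maxDurationMs) {
        long deadline = System.currentTimeMillis() + maxDurationMs;
        int polls = 0;
        // KafkaConsumer<String, String> consumer = new KafkaConsumer<>(consumerProps());
        // consumer.subscribe(Collections.singletonList("input-topic"));
        while (System.currentTimeMillis() < deadline) {
            // consumer.poll(...) and process records here
            polls++;
        }
        // consumer.close();  // if this is skipped, the old group member
        //                    // lingers and the next iteration's consumer
        //                    // may get no partitions assigned
        return polls;
    }
}
```

The symptom in my logs looks consistent with the close step at the end of an iteration not happening cleanly, so the group coordinator still counts the old member when the next iteration joins.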
Thanks,
Ajay
Hi,
Could you please post screenshots of your job settings on the forum? What is the Java code in your tJava component? How did you configure the tKafkaInput component?
More information will help us address your issue.
Best regards
Sabrina
If you keep the same consumer group ID, I think you need to set the offset reset to "latest". I would also suggest moving all the components after the tJava into a separate subjob.
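In raw Kafka consumer terms, the suggestion above maps to settings like these. The broker address and group ID are illustrative; the property names are the standard Kafka consumer settings:

```java
import java.util.Properties;

public class ConsumerGroupFix {
    // Suggested consumer settings for a consumer group that is
    // recreated on every tLoop iteration.
    static Properties fixedProps() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker:9092"); // assumed broker address
        props.put("group.id", "lookup-join-group");    // illustrative group ID
        // Start from the latest offset when no committed offset exists,
        // instead of re-reading old data:
        props.put("auto.offset.reset", "latest");
        // A shorter session timeout makes the coordinator evict a member
        // that was not closed cleanly sooner, so the next iteration's
        // consumer can get partitions assigned:
        props.put("session.timeout.ms", "10000");
        return props;
    }
}
```

Putting everything after tJava in its own subjob also helps, because when the subjob completes its components are shut down, which is the equivalent of calling consumer.close() at the end of each iteration.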