Hi Team,
What are the ways to handle errors in a Talend Big Data Batch job?
Some of the Big Data components, e.g. tHiveInput, do not have an "OnSubjobError" trigger. How should we enable error handling in Spark?
Hello,
The OnSubjobError connector is not supported by most Spark components.
Best regards
Sabrina
Hello,
What kind of error do you want to handle? Do you want to capture an exception in Spark jobs?
Best regards
Sabrina
Hi Sabrina,
Let's say I am performing the operation below.
tHiveInput --> tMap (Some Transformation) --> tHiveOutput
Even if there is an issue in the transformation, the YARN application completes successfully. I am not able to record or track any issue unless I go and check the logs.
Hi Guys,
We are facing a similar situation building Big Data Batch jobs. Do you know the resolution or best practices for handling errors in Big Data Batch?
Thanks,
KP
Below is the response from Talend on the same question.
"You won't find components such as tLogCatcher or tStatCatcher in BD jobs. It is always recommended to have a DI orchestrator job for your Spark jobs and to trigger the Spark jobs from the DI job. You can pass context for statistics, errors and other information from the orchestrator DI job to the BD job. Orchestration should always happen through DI; the purpose of BD jobs is to process huge volumes of data. https://www.talend.com/blog/2016/10/05/talend-job-design-patterns-best-practices-part-3/ Error handling is explained in detail in the above blog."
cheers !!
KP
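For anyone trying to picture the orchestrator pattern from the Talend response, here is a minimal plain-Java sketch of the idea: a parent process launches the batch job, passes it a context, and records the exit code or exception in one place instead of leaving it in the YARN logs. It is only an illustration and not Talend's generated code or API. The names OrchestratorSketch, BatchJob, RunRecord, runAndRecord and the run_id context key are all hypothetical, and the lambdas stand in for the real Spark job.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class OrchestratorSketch {

    /** Hypothetical stand-in for a Big Data Batch job launched by the DI job. */
    public interface BatchJob {
        int run(Map<String, String> context) throws Exception;
    }

    /** One row of the hypothetical error/statistics log kept by the orchestrator. */
    public static final class RunRecord {
        public final String jobName;
        public final int exitCode;
        public final String message;

        RunRecord(String jobName, int exitCode, String message) {
            this.jobName = jobName;
            this.exitCode = exitCode;
            this.message = message;
        }

        public boolean failed() {
            return exitCode != 0;
        }
    }

    /** In-memory log; a real orchestrator would write to a table or file. */
    public static final List<RunRecord> LOG = new ArrayList<>();

    /** Runs the child job and records the outcome, whether it returns or throws. */
    public static RunRecord runAndRecord(String jobName, BatchJob job,
                                         Map<String, String> context) {
        RunRecord record;
        try {
            int exitCode = job.run(context);
            String message = (exitCode == 0) ? "OK" : "non-zero exit code";
            record = new RunRecord(jobName, exitCode, message);
        } catch (Exception e) {
            // Treat any exception from the child job as a failure with exit code 1.
            record = new RunRecord(jobName, 1, String.valueOf(e.getMessage()));
        }
        LOG.add(record);
        return record;
    }

    public static void main(String[] args) {
        Map<String, String> context = new HashMap<>();
        context.put("run_id", "demo-001"); // hypothetical context variable

        // Lambdas simulate a successful run and a failing transformation.
        runAndRecord("hive_to_hive", ctx -> 0, context);
        runAndRecord("hive_to_hive", ctx -> {
            throw new IllegalStateException("transformation failed");
        }, context);

        for (RunRecord r : LOG) {
            System.out.println(r.jobName + " exit=" + r.exitCode + " msg=" + r.message);
        }
    }
}
```

The point of the design is that the failure check lives in the parent, so it does not depend on the Spark components offering an error trigger of their own.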