Hi
We have implemented a job that pulls 53 million records with 11 columns and passes them through a tRunDataPrep component. The recipe is fairly simple: nothing more complex than uppercasing, removing whitespace, extracting numbers from fields, and so on.
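The actual transformations happen inside tRunDataPrep, but to illustrate the kind of per-field string operations the recipe applies, a rough Java equivalent (method names are mine, purely for illustration) would be:

```java
public class RecipeSketch {

    // Uppercase a field value.
    static String uppercase(String s) {
        return s == null ? null : s.toUpperCase();
    }

    // Strip all whitespace from a field value.
    static String removeWhitespace(String s) {
        return s == null ? null : s.replaceAll("\\s+", "");
    }

    // Keep only the digits of a field value.
    static String extractNumbers(String s) {
        return s == null ? null : s.replaceAll("\\D+", "");
    }

    public static void main(String[] args) {
        System.out.println(uppercase("abc"));            // ABC
        System.out.println(removeWhitespace(" a b c ")); // abc
        System.out.println(extractNumbers("ref-1234"));  // 1234
    }
}
```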
We are using a joblet to run the execution, as we have 11 different sources and need a different schema per recipe; this keeps the configuration required for each one to a minimum. We did look at Dynamic Schema, but it doesn't work for us given the way we need to orchestrate, so this is an approach we are comfortable with.
The job runs fine until it reaches around 6 million records, at which point it times out with a socket error. Following some research into JVM timeout parameters, I have just added
-Dws_client_connection_timeout=180000
-Dws_client_receive_timeout=180000
and am now rerunning, but I would appreciate some thoughts on this issue and how best to resolve it.
Attached are the flow and the error output.
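For context, -D flags like these are read inside the JVM as system properties by whatever web-service client the component uses. I'm assuming tRunDataPrep's client honours these exact property names; the general pattern they control looks something like this sketch (the endpoint URL is hypothetical):

```java
import java.net.HttpURLConnection;
import java.net.URL;

public class TimeoutSketch {
    public static void main(String[] args) throws Exception {
        // Read the timeouts from the -D system properties, falling back to 60s.
        int connectTimeoutMs = Integer.parseInt(
                System.getProperty("ws_client_connection_timeout", "60000"));
        int receiveTimeoutMs = Integer.parseInt(
                System.getProperty("ws_client_receive_timeout", "60000"));

        // Hypothetical endpoint; the real one is whatever the component calls.
        URL serviceUrl = new URL("http://localhost:8080/dataprep");
        HttpURLConnection conn = (HttpURLConnection) serviceUrl.openConnection();
        conn.setConnectTimeout(connectTimeoutMs); // time allowed to open the socket
        conn.setReadTimeout(receiveTimeoutMs);    // time allowed waiting for a response

        System.out.println("Connect timeout: " + conn.getConnectTimeout() + " ms");
        System.out.println("Read timeout:    " + conn.getReadTimeout() + " ms");
    }
}
```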
Hi
I have forwarded your question to our developers and hope they can get back to you soon.
Regards
Shong
Thanks
I have restructured the job so that it now extracts the data into flat files in 2 million-record chunks. Each chunk is then read by the next subjob and passed into tRunDataPrep, roughly as sketched below. Everything now works fine and executes as expected.
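In the job itself this is done with Talend components rather than hand-written code, but the chunking logic amounts to the following Java sketch (the file-name prefix and the row source are hypothetical):

```java
import java.io.BufferedWriter;
import java.io.FileWriter;
import java.io.IOException;
import java.util.Iterator;
import java.util.List;

public class ChunkSketch {
    private static final long CHUNK_SIZE = 2_000_000L; // 2m rows per flat file

    // Spill incoming rows into numbered flat files of CHUNK_SIZE rows each,
    // so each downstream tRunDataPrep call sees a bounded batch.
    static void writeChunks(Iterator<String> rows, String prefix) throws IOException {
        int fileIndex = 0;
        long rowsInFile = 0;
        BufferedWriter out = new BufferedWriter(new FileWriter(prefix + fileIndex + ".csv"));
        try {
            while (rows.hasNext()) {
                if (rowsInFile == CHUNK_SIZE) {
                    out.close(); // finish the current chunk and start the next file
                    fileIndex++;
                    rowsInFile = 0;
                    out = new BufferedWriter(new FileWriter(prefix + fileIndex + ".csv"));
                }
                out.write(rows.next());
                out.newLine();
                rowsInFile++;
            }
        } finally {
            out.close();
        }
    }

    public static void main(String[] args) throws IOException {
        writeChunks(List.of("a,1", "b,2", "c,3").iterator(), "chunk_");
    }
}
```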
It would be good to know whether there is a way to get this through without having to chunk the data, so please do post an update here when one is available.
Thanks
Dave