Hi, I'm trying to design a job with several subjobs.
The idea is to split a big transformation into several readable steps.
I have a parent job that calls my first child job. This child job ends with a tBufferOutput (is that the right output?)
Then in my parent job, with a main connection, I direct the flow from my first child job to my second child job.
The question is, how can I retrieve my rows in my second child job?
I tried with a tBufferInput, but with no success.
I attached some screenshots to make this easier to understand.
Thank you for your help!
Make sure the schema of tBufferOutput in the child job matches the schema of tRunJob in the parent job. Based on the screenshots, it appears your job is running but not showing output.
Yes, it's the same schema; I used the same metadata.
I attached new screenshots.
Also, it seems that the main connection between the jobs is not right, as the second child appears to be called once for each row from the first child!
I tried to use OnSubjobOk, but no success either...
I attached another screenshot for it.
Your child job should end with tBufferOutput. That will allow the parent job to read output from it. See the screenshot attached.
Yes, I got that working, but my problem is when I try to reinject the rows into another child job.
The first child job should end with tBufferOutput, and the schema of tBufferOutput should match the schema of the tRunJob for the first child job in the parent job. Then put a tJavaRow between the first and second child jobs and pass the data from one to the other, either by generating the code in the tJavaRow, or by storing the values in globalMap with the globalMap.put method and retrieving them in the second job by passing them as context parameters.
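To make the globalMap suggestion above a bit more concrete, here is a minimal sketch in plain Java rather than actual Talend-generated code. The `globalMap` field is only a stand-in for the map Talend exposes inside a job, and the `Row` class, the `child1Rows` key, and the `collect` and `rowsForChild2` helpers are hypothetical names made up for illustration. Whether a whole list can be handed to the second child through a context parameter depends on how that context variable is typed, so treat that part as an assumption.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class GlobalMapSketch {
    // Stand-in for the job-level globalMap (assumed here to be a Map<String, Object>).
    static final Map<String, Object> globalMap = new HashMap<>();

    // Hypothetical row type; a real job would use the schema columns of the tRunJob.
    static class Row {
        final int id;
        final String name;

        Row(int id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    // Roughly what a tJavaRow placed after the first child could do per incoming row:
    // append the row to a list kept in globalMap under a hypothetical key.
    @SuppressWarnings("unchecked")
    static void collect(Row inputRow) {
        List<Row> rows = (List<Row>) globalMap.get("child1Rows");
        if (rows == null) {
            rows = new ArrayList<>();
            globalMap.put("child1Rows", rows);
        }
        rows.add(inputRow);
    }

    // Roughly what the parent could read back before calling the second child,
    // for example to pass it on as a context parameter (an assumption).
    @SuppressWarnings("unchecked")
    static List<Row> rowsForChild2() {
        Object stored = globalMap.get("child1Rows");
        return stored == null ? new ArrayList<Row>() : (List<Row>) stored;
    }

    public static void main(String[] args) {
        collect(new Row(1, "alpha"));
        collect(new Row(2, "beta"));
        System.out.println(rowsForChild2().size());
    }
}
```

Collecting the rows into one list, instead of one context value per column, is what keeps this from turning into a long list of context parameters. It also means the second child would be called once after the first has finished, not once per row.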
Using the context seems a bit tedious but I will try.
So far I have solved my issue by storing my rows in a file at the end of child 1, then using that file to continue the processing in child 2, but I guess this is not good practice and is bad for performance.
Maybe I can store the result in some hash-map-style in-memory store. Using tBufferInput and tBufferOutput seemed to be the simplest way.
What do best practices advise for this kind of case?
Or maybe the way I split my transformation is not right in the first place? Otherwise everyone would have this kind of issue for any transformation with more than a few steps.
This feels like very standard stuff, yet I can't find anything but that convoluted context solution.
How do you guys do it?
I have a file, and I want to apply several transformations to it and split them across several jobs.
The goal is to have a parent job calling each successive child job in sequence. The first child job should read the initial file, and the last child should write the result.
I can get the flow out of the child with tBufferOutput and get it in the parent through the tRunJob, but I can never reinject it into the next child without creating a huge list of context values with every single column in it. Why couldn't I get the flow in the next child using a simple tBufferInput, to read the tBufferOutput of the previous child?
Thanks a lot for your help. Let me know if you have another way of splitting a big job into several subjobs while using the same flow of data during the whole process.
Use the tHashInput and tHashOutput instead. See: