Create multiple output files based on an unknown number of keys
I am trying to split one flow into multiple outputs based on a key. There is an unknown number of keys, and every row with the same key should be written to a tFileOutputDelimited corresponding to that key. Any thoughts on how to go about doing this?
Hi,
1) Calculate the number of distinct key values in the input data
2) Store this value in a variable
3) Use an orchestration component such as tFlowToIterate or tLoop for the looping mechanism
4) As the number of input records is dynamic, use the variable value (stored in step 2) as the number of iterations
5) Use Input component --> Filter --> tFileOutputDelimited
6) Use a parameter in the filter and in tFileOutputDelimited so that each key value is written to its corresponding file
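Outside of Talend, the idea behind these steps (find the distinct keys, then route each row to the file for its key) could be sketched in plain Java roughly as follows. This is only an illustration, not Talend-generated code: the class name, the method names, the ";" delimiter and the <key>.csv naming scheme are all assumptions made for the example.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

class SplitByKey {
    /** Groups delimited rows by the field at keyIndex, keeping keys in first-seen order. */
    static Map<String, List<String>> groupByKey(List<String> rows, String delimiter, int keyIndex) {
        Map<String, List<String>> groups = new LinkedHashMap<>();
        for (String row : rows) {
            // Pattern.quote treats the delimiter literally, not as a regex
            String key = row.split(Pattern.quote(delimiter), -1)[keyIndex];
            groups.computeIfAbsent(key, k -> new ArrayList<>()).add(row);
        }
        return groups;
    }

    /** Writes one file per key into outDir, named <key>.csv (hypothetical naming scheme). */
    static void writeGroups(Map<String, List<String>> groups, Path outDir) throws IOException {
        Files.createDirectories(outDir);
        for (Map.Entry<String, List<String>> entry : groups.entrySet()) {
            Files.write(outDir.resolve(entry.getKey() + ".csv"), entry.getValue());
        }
    }

    public static void main(String[] args) throws IOException {
        List<String> rows = Arrays.asList("A;1", "B;2", "A;3"); // sample data
        Map<String, List<String>> groups = groupByKey(rows, ";", 0);
        Path outDir = Files.createTempDirectory("split");
        writeGroups(groups, outDir);
        System.out.println("Wrote " + groups.size() + " files to " + outDir);
    }
}
```

Note that the sketch never needs the key count up front; the number of output files simply follows from the number of groups.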
Best Regards,
Mayur
Thanks for the response. I think I understand the concept you put forward, but I'm not sure how to implement it. 1) How do I store the value in a variable from the flow? 2) How is tFlowToIterate used? I'm new to Talend. Thanks for your help! -- Jeff
Hi Jeff,
For your 1st question:
# You can use the tUniqRow component to identify the unique records in the input
# Pass these values to a tMap component
# In the tMap component, create a variable (v_RowCount) with Integer as its data type and increase its value by one for each row
# Once the flow is completed, the tMap variable (v_RowCount) will hold the number of unique records in the file
# You can use this variable in your further logic
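To make it clearer what the unique-row step followed by a counter variable ends up computing, here is a small plain-Java sketch. It is not what Talend generates; the class and method names are invented for the example, and vRowCount merely plays the role of v_RowCount.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class UniqueKeyCounter {
    /** Counts distinct keys by adding one each time a key is seen for the first time. */
    static int countUniqueKeys(List<String> keys) {
        Set<String> seen = new HashSet<>();
        int vRowCount = 0; // stands in for the tMap variable v_RowCount
        for (String key : keys) {
            // Set.add returns true only when the key was not already present
            if (seen.add(key)) {
                vRowCount++;
            }
        }
        return vRowCount;
    }

    public static void main(String[] args) {
        // Sample keys: A, B and C are distinct, so this prints 3
        System.out.println(countUniqueKeys(Arrays.asList("A", "B", "A", "C")));
    }
}
```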
For your 2nd question: for help with any component, use the following steps
# Drag and drop the component onto the work area
# Press F1
# A link will appear in the Help window
# Click on that link; it will take you to the help section containing information specific to that component
# Below those details there are also case studies, which help in understanding how to implement a job using this component
Try out these things and let me know in case you face any issues.
Best Regards,
Mayur
One little trick that might be helpful here:
Almost all components update a globalMap key <component_name>_NB_LINE with the number of rows the component processed. You can retrieve this value with a call like this (substituting your component name, of course):
(Integer)globalMap.get("tOracleInput_1_NB_LINE")
This can be very useful when you want to retrieve the number of rows that have gone through any component.
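One thing to watch for with this cast: if the key is not in the map (for example because the component has not finished yet, or the name is misspelled), get returns null. A minimal, self-contained sketch of a null-safe lookup is shown below. The HashMap only stands in for Talend's globalMap, and the helper name nbLine, the sample value 42 and the component name tMissing_1 are assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;

class NbLineLookup {
    /** Returns the count stored under <componentName>_NB_LINE, or 0 if the key is absent. */
    static int nbLine(Map<String, Object> globalMap, String componentName) {
        Integer count = (Integer) globalMap.get(componentName + "_NB_LINE");
        return count == null ? 0 : count;
    }

    public static void main(String[] args) {
        // Stand-in for Talend's globalMap, filled with a sample value
        Map<String, Object> globalMap = new HashMap<>();
        globalMap.put("tOracleInput_1_NB_LINE", 42);

        System.out.println(nbLine(globalMap, "tOracleInput_1")); // prints 42
        System.out.println(nbLine(globalMap, "tMissing_1"));     // prints 0 (key not set)
    }
}
```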
Hi, I've written a new tutorial on "how to split a file into many files regarding a key on each record" which explains how to solve this kind of task. It is currently only available in French. The tutorial shows 3 different techniques to achieve this task. Hope it can be useful.
Here is how to write a file for each row using a unique key.
0) Create, or use an existing, unique key on each row
1) Read the file to feed the key into a loop (tFlowToIterate)
2) On the second read, embedded in the iteration, filter on the iterator key
3) Change the name of the file to use the iterator key and the current date-time stamp
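For step 3, the file name is typically built as a Java expression from the value that tFlowToIterate placed in globalMap. The sketch below shows one way such a name could be assembled. It assumes the incoming connection is called row1 and the key column is called key, so the value sits under "row1.key"; the output directory, the timestamp pattern and the helper name are likewise made up for the example, and the HashMap only stands in for Talend's globalMap.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.HashMap;
import java.util.Map;

class KeyedFileName {
    // Hypothetical timestamp layout, e.g. 20200102_030405
    private static final DateTimeFormatter STAMP = DateTimeFormatter.ofPattern("yyyyMMdd_HHmmss");

    /** Builds <outDir>/<key>_<timestamp>.csv from the current iterator key. */
    static String fileNameFor(Map<String, Object> globalMap, String outDir, LocalDateTime now) {
        String key = (String) globalMap.get("row1.key"); // assumed connection and column names
        return outDir + "/" + key + "_" + now.format(STAMP) + ".csv";
    }

    public static void main(String[] args) {
        Map<String, Object> globalMap = new HashMap<>(); // stand-in for Talend's globalMap
        globalMap.put("row1.key", "A");
        System.out.println(fileNameFor(globalMap, "/tmp/out", LocalDateTime.now()));
    }
}
```

Passing the time in as a parameter keeps the helper easy to check with a fixed date; in a job you would normally just use the current time.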
Hi, I would like to generate multiple files based on year, and all these files have to be stored in their corresponding year folders automatically. Please find screenshots of my requirement below:
Hi ashajyothi.ece,
Can you please upload the screenshots you wanted to show again? For some reason they didn't make it into your post.
Best regards
Sabrina