Anonymous
Not applicable

How to parse an input round robin into multiple outputs

I am trying to load an input file into Redshift, and I want to split the file round robin before loading it so that the load can make use of the multiple slices in my cluster. How do I split an input into n outputs in round-robin fashion using Talend?

 

Example:

Input:

id     name
1      Jon
2      Anne
3      Cole
4      Zack
5      Ellen

Output:

Main1
1      Jon
4      Zack

Main2
2      Anne
5      Ellen

Main3
3      Cole

3 Replies
cterenzi
Specialist

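You can create three tMap outputs, each with a filter condition on the remainder of the id: rowX.id % 3 == 0 for the first, rowX.id % 3 == 1 for the second, and rowX.id % 3 == 2 for the third. Then send each output to a separate file.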
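To illustrate the modulo filter outside of Talend, here is a minimal plain-Java sketch. The class and method names (RoundRobinSplit, bucketFor, split) are made up for this example and are not Talend APIs; it also assumes the ids are whole numbers. In tMap the same test would live in each output's filter expression.

```java
import java.util.ArrayList;
import java.util.List;

class RoundRobinSplit {
    // Remainder of id divided by n, kept in the range [0, n) even for negative ids.
    static int bucketFor(long id, int n) {
        return (int) Math.floorMod(id, (long) n);
    }

    // Puts each id into the list whose index equals id % n.
    static List<List<Long>> split(long[] ids, int n) {
        List<List<Long>> outputs = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            outputs.add(new ArrayList<>());
        }
        for (long id : ids) {
            outputs.get(bucketFor(id, n)).add(id);
        }
        return outputs;
    }

    public static void main(String[] args) {
        List<List<Long>> out = split(new long[] {1, 2, 3, 4, 5}, 3);
        for (int i = 0; i < out.size(); i++) {
            System.out.println("remainder " + i + ": " + out.get(i));
        }
        // remainder 0: [3]
        // remainder 1: [1, 4]
        // remainder 2: [2, 5]
    }
}
```

With the sample data from the question, remainder 1 corresponds to Main1, remainder 2 to Main2, and remainder 0 to Main3.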
Anonymous
Not applicable
Author

Thank you for the reply. I thought about doing that, but I actually need 6 outputs (I used 3 in my question to simplify the problem). With this method, when the id is divisible by 6, the conditions rowX.id % 3 == 0, rowX.id % 2 == 0 and rowX.id % 6 == 0 are all true. I can't think of a simple filter that would split it 6 ways.

cterenzi
Specialist

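You can create six outputs and change the expression to mod 6: rowX.id % 6 == 0 for the first output, through rowX.id % 6 == 5 for the sixth. Because every output uses the same divisor, each id matches exactly one condition.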
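One caveat worth sketching: the modulo on id only deals rows out evenly when the ids are consecutive. If they have gaps, a running row counter can take the place of the id. The plain-Java sketch below assumes that situation; CounterRoundRobin and deal are hypothetical names for this example. Inside a Talend job the counter would have to come from somewhere else, for instance the Numeric.sequence routine, which is worth checking against your Talend version.

```java
import java.util.ArrayList;
import java.util.List;

class CounterRoundRobin {
    // Deals rows into `outputs` lists in arrival order, ignoring the id values.
    static List<List<Long>> deal(long[] ids, int outputs) {
        List<List<Long>> result = new ArrayList<>();
        for (int i = 0; i < outputs; i++) {
            result.add(new ArrayList<>());
        }
        int counter = 0; // position of the row in the flow, starting at 0
        for (long id : ids) {
            result.get(counter % outputs).add(id);
            counter++;
        }
        return result;
    }

    public static void main(String[] args) {
        // Every id here is divisible by 6, so id % 6 would put all rows in one output.
        long[] ids = {6, 12, 18, 24, 30, 36, 42};
        List<List<Long>> out = deal(ids, 6);
        for (int i = 0; i < out.size(); i++) {
            System.out.println("output " + i + ": " + out.get(i));
        }
        // output 0: [6, 42]
        // output 1: [12]
        // output 2: [18]
        // output 3: [24]
        // output 4: [30]
        // output 5: [36]
    }
}
```

Each row still lands in exactly one output, since a number has only one remainder when divided by 6.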

Alternatively, I think you can set a row limit on tFileOutputDelimited, and it will split the output into files of that size. To get a consistent number of files, you'd need to get a record count and divide it by the number of files you want. I can't test right now, but I'd assume it follows the sort order of the data flow, so it wouldn't give you a round robin of ids unless you added the modulo expression as a new column and sorted by that column first and by the id second.
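The sort-then-chunk idea can be simulated in plain Java to see what it produces. This is only a sketch of the logic, not of the Talend component: ChunkedSplit, rowsPerFile and sortThenChunk are hypothetical names, and the chunk size is assumed to be the record count divided by the number of files, rounded up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class ChunkedSplit {
    // Rows per file so that `files` chunks cover `rowCount` rows (ceiling division).
    static int rowsPerFile(int rowCount, int files) {
        return (rowCount + files - 1) / files;
    }

    // Sorts by (id % files, id), then cuts the sorted rows into fixed-size chunks.
    static List<List<Long>> sortThenChunk(Long[] ids, int files) {
        Long[] sorted = ids.clone();
        Arrays.sort(sorted, Comparator
                .comparingLong((Long id) -> Math.floorMod(id, (long) files))
                .thenComparingLong(id -> id));
        int size = rowsPerFile(sorted.length, files);
        List<List<Long>> chunks = new ArrayList<>();
        for (int start = 0; start < sorted.length; start += size) {
            int end = Math.min(start + size, sorted.length);
            chunks.add(new ArrayList<>(Arrays.asList(sorted).subList(start, end)));
        }
        return chunks;
    }

    public static void main(String[] args) {
        // Six rows, three files: the chunks line up with the modulo groups.
        System.out.println(sortThenChunk(new Long[] {1L, 2L, 3L, 4L, 5L, 6L}, 3));
        // [[3, 6], [1, 4], [2, 5]]

        // Five rows, three files: the groups have unequal sizes, so the chunks drift.
        System.out.println(sortThenChunk(new Long[] {1L, 2L, 3L, 4L, 5L}, 3));
        // [[3, 1], [4, 2], [5]]
    }
}
```

In this simulation the chunks only match the round-robin groups when the record count divides evenly by the number of files. With the five-row sample from the question they do not, so the separate tMap outputs look like the safer route when the exact grouping matters.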