Do not input private or sensitive data. View Qlik Privacy & Cookie Policy.
Skip to main content

Announcements
Write Table now available in Qlik Cloud Analytics: Read Blog
cancel
Showing results for 
Search instead for 
Did you mean: 
Anonymous
Not applicable

[resolved] Split a single row to multiple rows (CSV File).

Hy everyone!.
I'm using talend studio 3.2.0 with WinXP SP3. I have a csv file that has following fields in it.
Timestamp, param1,param2,param3,param1,param2,param3,param1,param2,param3
2009-08-15,1,2,3,4,5,6,7,8,9
Basically i want to seperate each record into 3 different records like this.
Timestamp,param1,param2,param3
2009-08-15,1,2,3
2009-08-15,4,5,6
2009-08-15,7,8,9
Can this be achieved using talend etl tool? if yes what components should be used. I'm entirely new to talend and considering it as an alternative to Pentaho kettle.
Labels (3)
1 Solution

Accepted Solutions
Anonymous
Not applicable
Author

Hello
Here is the demo:
in.csv:

Timestamp,param1,param2,param3,param1,param2,param3,param1,param2,param3
2009-08-15,1,2,3,4,5,6,7,8,9
2009-09-12,A,B,C,D,E,F,G,H,I

code on tJavaRow:
output_row.timestamp = input_row.timestamp;
output_row.conn= input_row.p1+","+input_row.p2+","+input_row.p3+"@"+input_row.p4+","+input_row.p5+","+input_row.p6+"@"+input_row.p7+","+input_row.p8+","+input_row.p9;

Result:
Starting job forum8666 at 16:31 28/10/2009.
connecting to socket on port 3947
connected
.----------+------+------+------.
| tLogRow_1 |
|=---------+------+------+-----=|
|timestamp |param1|param2|param3|
|=---------+------+------+-----=|
|2009-08-15|1 |2 |3 |
|2009-08-15|4 |5 |6 |
|2009-08-15|7 |8 |9 |
|2009-09-12|A |B |C |
|2009-09-12|D |E |F |
|2009-09-12|G |H |I |
'----------+------+------+------'
disconnected
Job forum8666 ended at 16:31 28/10/2009.

Best regards

shong

View solution in original post

5 Replies
Anonymous
Not applicable
Author

Hello guy
Yes, it is easy to finish it with talend etl tool. I will show you a demo soon...
Best regards

shong
Anonymous
Not applicable
Author

Hello
Here is the demo:
in.csv:

Timestamp,param1,param2,param3,param1,param2,param3,param1,param2,param3
2009-08-15,1,2,3,4,5,6,7,8,9
2009-09-12,A,B,C,D,E,F,G,H,I

code on tJavaRow:
output_row.timestamp = input_row.timestamp;
output_row.conn= input_row.p1+","+input_row.p2+","+input_row.p3+"@"+input_row.p4+","+input_row.p5+","+input_row.p6+"@"+input_row.p7+","+input_row.p8+","+input_row.p9;

Result:
Starting job forum8666 at 16:31 28/10/2009.
connecting to socket on port 3947
connected
.----------+------+------+------.
| tLogRow_1 |
|=---------+------+------+-----=|
|timestamp |param1|param2|param3|
|=---------+------+------+-----=|
|2009-08-15|1 |2 |3 |
|2009-08-15|4 |5 |6 |
|2009-08-15|7 |8 |9 |
|2009-09-12|A |B |C |
|2009-09-12|D |E |F |
|2009-09-12|G |H |I |
'----------+------+------+------'
disconnected
Job forum8666 ended at 16:31 28/10/2009.

Best regards

shong
Anonymous
Not applicable
Author

hey that's freakin GREAT man, I liked the '@' delimeter approach you have used.
Thanks for taking out time and mentioning it in so much detail. This tool really provides endless possibilities in the ETL domain. One just needs to use the building blocks efficiently.
Best Regards,
Umar
Anonymous
Not applicable
Author

The other option (which I used for a file) is to create three separate outputs with a tMap and then merge them with a tUnite. Doesn't require hand coding.
alevy
Specialist
Specialist

Rather than using tUnite, you can unite them directly in the tMap: when creating the second and third outputs within tMap just select the "Create join table from" option rather than the default "New output" option. There will then be only one flow out of the tMap but it will include the rows from both of the output tables within tMap.
However, from v4.2.0, the tSplitRow component handles the original requirement very well.