Transpose Column names

Anonymous · ‎2017-09-18

Hi,

I am new to Talend. Could any one please help for the below use case. How to perform this use case. I

Input:

id : 01

name:aaaa

age:29

id:02

name:bbbb

age:25

id:03

name:cccc

age:30

Output:

ID;NAME;AGE

01;AAAA:29

02;BBBB:25

03;CCCC:30

Anonymous · ‎2017-09-18

I assume the data you have shown is the format you would expect it to arrive in? If it is, then the main body of what is needed is covered in this tutorial (https://www.rilhia.com/tutorials/dynamic-column-order).

Essentially, with every row you are getting two values; the column header and value. Read these values in from your file as column header and value. Then use tMap variables (as I have done in the tutorial) to retrieve the correct value for each output column. Since the tMap variables hold their values between rows, your first row will just have your ID, the second value will have the ID and the name, the third row will have the ID, name and age. Since we know this, you can use a tAggregateRow after the tMap to group this data by ID. We know that only the last row per ID will hold all of the data. Therefore your tAggregateRow will group by ID and return the LAST value for the name and age. If configured correctly, you will end up with single rows for each of your records.

Anonymous · ‎2017-09-19

HI @rhall.

Thanks for your reply. This is not dynamic column issue. Column names are coming as rows with values with (: ) separator.

Like wise I have millions of records.

Here ID, Name and age are column names which are in row wise with their values separated by : and want to transpose the column names as headers and below the relevant values of particular column.

first record:

ID: 100

Name: AAA

Age:30

Second Record

ID: 101

Name:BBB

Age : 32

Third Record

ID: 102

Name:CCC

Age : 26

Output:

ID; Name; Age

100;AAA;30

101;BBB;32

102;CCC;26

Regards,

Anonymous · ‎2017-09-19

Try out what I suggested, it will work. The tutorial merely shows the kind of process you need, I followed that up with a description of what you would need to do in order to solve this scenario.

shivanand · ‎2017-09-20

Hi ,

This is like normalization in informatica which is not available directly in talend.

So here is turnaround

Use tmemorizerows to store last three row values as values are repeatating after 3 rows

Then in tjavaFlex get every 3rd value from tmemorize with following code in main

if(i%3==0)

{

row3.Column0=(((String[]) globalMap.get("tMemorizeRows_1_Column1"))[2]);

row3.Column1=(((String[]) globalMap.get("tMemorizeRows_1_Column1"))[1]);

row3.Column2=(((String[]) globalMap.get("tMemorizeRows_1_Column1"))[0]);

}

else

{

row3.Column0= null;

row3.Column1= null;

row3.Column2= null;

};

i++;

Declare variable in start code part as in picture

As there will be null rows for 1st and 2nd rows from source filter those in tfilterrow

ashif2 · ‎2017-09-20

I assumed we always have these key value pairs in the same order : (id,name,age). without any skip in the keys. (i.e. there wont be any id without name and age ).

With this assumption, i written in a simple logic to solve this issue.

Go through it and revert me back. I have not used any in memory components like thashoutput/ tbufferedoutput( instead i using temp file approach)

ttranspose.zip

Anonymous · ‎2017-09-20

I'm a little confused about the increasing complexity and memory usage of the offered solutions. If we can assume that the record sets will appear in order (ie set 1 after set 2... forgetting about column header order), we can do this in 4 components; an input component, a tmap, a tAggregateRow and an output component. If you need to sort the order, then 5 components (a tSort after the input). There is no need for complicated java or to memorize anything but the last calculated row (which can be handled by the tMap).

ashif2 · ‎2017-09-21

You are right . I made the code complicated. Actually I thought of using variables of tmap but not proceeded that way.

Thanks for your valuable suggestion .

TransPose_Made_Easy.JPG

Talend Data Integration

v6.x