Hi
I've got this kind of data table (Excel or csv):
name joe peter Kelly
surname smith warton hartford
age 23 28 34
occupation professor bartender manager
and I want to use the first column as names for my table fields like this:
name surname age occupation
joe smith 23 professor
peter warton 28 bartender
kelly hartford 34 manager
I've used several components but to no avail. Take into account I don't need aggregation whatsoever, just transpose and convert rows to columns using first column as header.
Hi Sabrina
I'm testing my ETL with a limited set of data. (see output before and after TurnRow)
Custom component seems to work fine but......(there's always a "but") it only processed 4 columns.
Taking into account that my first colum (the future column names) has 14 records, how can I force tTurnRow to process all data?
Thanks for being so helpful
Hi in addition to this issue, the real production input file may have thousand columns (see that every column is a record for each field) Maybe I'm doing it all wrong from the beginning.....
Whether your columns in each block are varying? Whether the order of the column is changing? If the answer is Yes for above, then how do you plan to arrange the data in the output? Any logical or business suggestion? Vaibhav
I feel like this is a complete hack, but it works if you're ok hard coding the column names. I couldn't figure out how to make it more dynamic.
tFileInputDelimited -> tJavaFlex -> tMap
tFileInputDelimited schema only has 1 column... Column0. It's set to read the file for line breaks, but doesn't parse any of the data in each row.
tJavaFlex splits the column into a key/value pair and assigns them to variables. It then compares the Key to each column name and if it's a match, assigns the value to that column.
Because it's a tJavaFlex, once a value is assigned to a column, it stays assigned through subsequent rows until it is replaced. So when it finds column1 in the first row, it keeps that value all the way through row15. You get a growing table as it finds more values. 15 lines in your original file means 15 rows of output, but only the last row has all the rows filled in.
tMap has a filter so it only shows rows where Column14 is not null.
Like I said, it's a hack and all of your key names are hard coded into the code. But it accomplishes what you wanted to do. You could then loop through all your files or if this is all 1 big file, you can do something at the top of your Main code that looks to see if Column14 != null and if that's true, reassign null to all the column rows to empty them and start over.
Main code part of tJavaFlex
if (row1.Column0.indexOf("=")!=-1)
{
String key = row1.Column0.substring(0,row1.Column0.indexOf("=")-1).trim();
String value = row1.Column0.substring(row1.Column0.indexOf("=")+2).replace("\"","").trim();
if (key.equals("Acct-Status-Type"))
{
row2.Column1 = value;
}
else if (key.equals("NAS-Port-Type"))
{
row2.Column2 = value;
}
else if (key.equals("User-Name"))
{
row2.Column3 = value;
}
else if (key.equals("NAS-Port"))
{
row2.Column4 = value;
}
else if (key.equals("Acct-Session-Id"))
{
row2.Column5 = value;
}
else if (key.equals("Event-Timestamp"))
{
row2.Column6 = value;
}
else if (key.equals("Acct-Input-Octets"))
{
row2.Column7 = value;
}
else if (key.equals("Acct-Output-Octets"))
{
row2.Column8 = value;
}
else if (key.equals("Acct-Input-Gigawords"))
{
row2.Column9 = value;
}
else if (key.equals("Acct-Output-Gigawords"))
{
row2.Column10 = value;
}
else if (key.equals("Acct-Input-Packets"))
{
row2.Column11 = value;
}
else if (key.equals("Acct-Output-Packets"))
{
row2.Column12 = value;
}
else if (key.equals("Acct-Session-Time"))
{
row2.Column13 = value;
}
else if (key.equals("Timestamp"))
{
row2.Column14 = value;
}
}
Hi Sbaer Thanks a lot for your suggestion but the issue is that nobody can assure me the fields are coming the same order every set of data. As sanvaibhav asked: Whether your columns in each block are varying? Yes Whether the order of the column is changing? Yes My initial idea was to use TurnRow component forcing it to pivot all columns not only 4