SNasif
Contributor III

Pivot rows to columns without needing delimited file

Hey Community,

I have the following issue: I have a Talend job that runs a series of transformations and finally generates the following sample data:

Attribute | Value | UID
ATT1 | a1 | entry1
ATT2 | a2 | entry1
ATT3 | a3 | entry1
ATT4 | a4 | entry1
ATT1 | a5 | entry2
....etc.

For each UID, I get a unique entry that is then parsed and tabularized into the structure above. I could have a whole bunch of entries with differing numbers of attributes (a fourth column named Class guides where each entry gets filled, but that's slightly irrelevant here). In short, I simply want to pivot/transpose the rows so the attributes become columns, with each UID holding one row with all the necessary information filled in. I'm aware that different entries might produce different columns, but the job takes care of that by mapping similar entries to the same DB table. Before I load the data into the DB table, though, I need it pivoted back into the following structure:

ATT1 | ATT2 | ATT3 | ATT4 | UID

a1 | a2 | a3 | a4 | entry1

I've tried the tDenormalize component, but that didn't seem to work (I'm worried I'm missing the concept of how to use it properly?). tPivotToColumnsDelimited works perfectly, except that it writes its output to a CSV file, while I need the output sent directly into a tMap component and then a DB table.

Thank you for your support and help in advance!

 

EDIT1: I know I could read the delimited file back in using an 'On Subjob Ok' trigger, but I am working with huge amounts of data at an organizational level, with jobs running around the clock for different teams and clients, and it wouldn't be ideal to store tons of CSV files just to map them back in. Thank you again!
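For reference, the pivot itself is straightforward to express in code, for instance inside a tJavaFlex or a custom routine, which avoids the intermediate CSV entirely. The sketch below is plain Java and only illustrates the grouping logic; the class and method names are hypothetical, not part of any Talend API. It groups the three-column (Attribute, Value, UID) rows by UID so each UID ends up with one attribute-to-value map:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class PivotSketch {
    // Pivot (Attribute, Value, UID) rows into one logical row per UID:
    // the result maps each UID to its attribute -> value pairs.
    public static Map<String, Map<String, String>> pivot(List<String[]> rows) {
        Map<String, Map<String, String>> byUid = new LinkedHashMap<>();
        for (String[] row : rows) {
            String attribute = row[0], value = row[1], uid = row[2];
            byUid.computeIfAbsent(uid, k -> new LinkedHashMap<>())
                 .put(attribute, value);
        }
        return byUid;
    }

    public static void main(String[] args) {
        List<String[]> rows = new ArrayList<>();
        rows.add(new String[]{"ATT1", "a1", "entry1"});
        rows.add(new String[]{"ATT2", "a2", "entry1"});
        rows.add(new String[]{"ATT3", "a3", "entry1"});
        rows.add(new String[]{"ATT4", "a4", "entry1"});
        rows.add(new String[]{"ATT1", "a5", "entry2"});

        // entry1 gets all four attributes, entry2 only ATT1
        for (Map.Entry<String, Map<String, String>> e : pivot(rows).entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```

Each per-UID map can then be handed to the next step (e.g. a tMap and the DB output component) in memory, with missing attributes simply absent rather than written out to disk.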

1 Reply
anselmopeixoto
Partner - Creator III

Hi @Sam Nsaif,

 

You can achieve this by placing the following components after your input sample data:

 

_input_ --> tAggregateRow --> tReplace --> tSplitRow --> _output_

 

On the tAggregateRow component, add UID in the "Group by" section. In the "Operations" section, add both the Attribute and Value columns, each using the "list" function.

 

On the tReplace component, add the Attribute and Value columns in the "Search/Replace" section. Then set the "Search" parameter to the "," string, set "Replace with" to the "|" string, and uncheck the "Whole word" option. This component is optional: it only replaces the commas generated by tAggregateRow's list function with a pipe character.

 

Finally, on the tSplitRow component, define two columns in its output schema (for example, Attributes and UID) and add two rows in the "Columns mapping" section. In the Attributes output column, put rowN.Attribute for the first row and rowN.Value for the second, where rowN is the name of the tSplitRow input row and Attribute/Value are columns from its input schema. In the UID output column, put "UID" (a fixed string) in the first row, matching rowN.Attribute, and rowN.UID in the second row, matching rowN.Value.

 

The output will look like the following, allowing a dynamic number of Attributes/Values per UID. You can remove the first row defined on tSplitRow if you want to drop the headers:

 

ATT1 | UID
a5 | entry2

ATT1 | ATT2 | ATT3 | ATT4 | UID
a1 | a2 | a3 | a4 | entry1
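The three steps above can be emulated in plain Java to see how that output is built. This is only a sketch of the data flow, not Talend code, and the class and method names are illustrative: the first loop mimics tAggregateRow's "list" function (comma-joined values per UID), the replace call mimics tReplace ("," to "|"), and the final loop mimics tSplitRow (a header row and a value row per UID):

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class AggregateSplitSketch {
    // Emulates tAggregateRow (group by UID, "list" on Attribute and Value),
    // then tReplace ("," -> "|"), then tSplitRow (two output rows per UID).
    public static List<String[]> run(List<String[]> rows) {
        Map<String, StringBuilder> attrs = new LinkedHashMap<>();
        Map<String, StringBuilder> vals = new LinkedHashMap<>();
        for (String[] r : rows) {
            String attribute = r[0], value = r[1], uid = r[2];
            StringBuilder a = attrs.computeIfAbsent(uid, k -> new StringBuilder());
            if (a.length() > 0) a.append(",");     // tAggregateRow "list" join
            a.append(attribute);
            StringBuilder v = vals.computeIfAbsent(uid, k -> new StringBuilder());
            if (v.length() > 0) v.append(",");
            v.append(value);
        }
        List<String[]> out = new ArrayList<>();
        for (String uid : attrs.keySet()) {
            // tReplace step: swap the list commas for pipes
            String header = attrs.get(uid).toString().replace(",", "|");
            String values = vals.get(uid).toString().replace(",", "|");
            // tSplitRow step: header row (Attributes + fixed "UID"), then value row
            out.add(new String[]{header, "UID"});
            out.add(new String[]{values, uid});
        }
        return out;
    }

    public static void main(String[] args) {
        List<String[]> rows = new ArrayList<>();
        rows.add(new String[]{"ATT1", "a1", "entry1"});
        rows.add(new String[]{"ATT2", "a2", "entry1"});
        rows.add(new String[]{"ATT3", "a3", "entry1"});
        rows.add(new String[]{"ATT4", "a4", "entry1"});
        rows.add(new String[]{"ATT1", "a5", "entry2"});

        // prints:
        // ATT1|ATT2|ATT3|ATT4 | UID
        // a1|a2|a3|a4 | entry1
        // ATT1 | UID
        // a5 | entry2
        for (String[] r : run(rows)) {
            System.out.println(r[0] + " | " + r[1]);
        }
    }
}
```

Since everything stays in the flow, the result can feed a tMap and a DB output component directly, with no intermediate CSV.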