Do not input private or sensitive data. View Qlik Privacy & Cookie Policy.
Skip to main content

Announcements
Join us to spark ideas for how to put the latest capabilities into action. Register here!
cancel
Showing results for 
Search instead for 
Did you mean: 
_AnonymousUser
Specialist III
Specialist III

Normalize csv file : from rows to columns

Hello,
I have a complicated thing to do with Talend..
My input is a csv file. Here is its schema :
Id_shop;Id_cust;Id_prod;Year;Month;day1;day2;day3;...;day30;
The fields "day1" to "day30" contain quantities. The quantities of products ("Id_prod") sold by a shop ("Id_Shop") to a customer ("Id_cust") at the date "Month/Year". The field "day1" contains the quantity of the first day of the month and "day30" the quantity of the last day.
I would like to load those data in a oracle table which has this schema :
Id_shop;Id_cust;Id_prod;Date;Quantity
No problem for the Id_shop, Id_cust and Id_prod (it's the primary key) but the field Date is defined by the month, the year and the days from the csv file.
For example :
in the csv file : month = 05 and year = 2008.
in the oracle table, i should have for each key all dates from 01/05/2008 to 30/05/2008
Then, for the date 01/05/2008 the field Quantity contains the quantity written in the field "day1" of the csv file.
Is that clear?
With informatica, there is a module called normalizer which allows to do that.. Is there something equivalent in Talend?
Thanks!
Nico
Labels (2)
14 Replies
_AnonymousUser
Specialist III
Specialist III
Author

Thanks a lot! This component is perfect for my needs.
Just one question : the final schema is composed of 3 columns :
row_key
pivot_key
pivot_value
How can we have at the end the same schema as the original schema? By exploding the "row_key" column thanks to a PERL regular expression?
Thanks,
Nicolep
Anonymous
Not applicable

How can we have at the end the same schema as the original schema? By exploding the "row_key" column thanks to a PERL regular expression?

In , you will find tExtract*Fields. Currently (Talend Open Studio 3.0), you have tExtractDelimitedFields, tExtractPositionalFields and tExtractRegexFields. tUnpivotRow was designed to be followed by a tExtractDelimitedFields.
_AnonymousUser
Specialist III
Specialist III
Author

Ok thanks..
This component allows to split the "row_key" but it doesn't allow to load the "pivot_key" and the "pivot_value" and it doesn't support 2 outputs..
How can we split the "row_key" and then load all the columns (with "pivot_key" and "pivot_value")? With a TMap ?
Anonymous
Not applicable

This component allows to split the "row_key" but it doesn't allow to load the "pivot_key" and the "pivot_value" and it doesn't support 2 outputs..
How can we split the "row_key" and then load all the columns (with "pivot_key" and "pivot_value")? With a TMap ?

tExtractDelimitedFields fills several output fields from a single input field + keep other input fields unchanged (tExtractRegexFields un component reference guide shows this way to use tExtract*Fields component).
_AnonymousUser
Specialist III
Specialist III
Author

Ok that's right, when we put the same column name ("pivot_key" and "pivot_value") in the output schema of the tExtractDelimitedFile, the column are automatically loaded.
It works fine!
Thanks