Skip to main content
Announcements
See what Drew Clarke has to say about the Qlik Talend Cloud launch! READ THE BLOG
cancel
Showing results for 
Search instead for 
Did you mean: 
Twisted
Contributor
Contributor

Import CSV and convert data for output to PostgreSQL

Dears,

 

I have trouble finding certain "blocks" in order to build something I need to create in Talend.

I've got 2 CSV files. Both are imported as "file delimited" and attached to a tLogRow.

 

From Alpha.csv I would need to convert the data as per below to PostgreSQL;

  • Column02 has either "1" or "2" mentioned, 1 should be converted to "Apple" and "2" to "Pear"
  • Column05 has a number. Either I.E. "112" or numbers like "54A". I would to output this data without the letters (hence "54" instead of "54A").
  • Column06 has 6 digits. And the 6 digits need to change in the string of Beta.csv. Beta.csv has 2 columns, one columns with strings the other column is equal to the 6 digits.

 

Any help on the matter would be appreciated. I've tried to search the official documentation and in Google.

But unfortunately I'm not able to find the sections of the documentation that could help me further. Most likely I'm searching wrong here...

 

Many thanks in advance!

Labels (2)
1 Solution

Accepted Solutions
gjeremy1617088143

in a csv file, you have one Less delimiter than the number of column.

if you have 5 column you have 4 delimiters.

if you have 2 delimiters in a row it’s maybe because the field is null for this row,

 

for eg you have a csv file with

city name and country and continent

 

city name;country;continent

Paris;France;Europa

New york;USA;America

Alger;;Africa

 

here the country is missing but delimiter had to be there to parse properly the data

 

if you have more or the same number of delimiter than column number your row have bad data you have to reject this one or clean up before process

 

View solution in original post

10 Replies
gjeremy1617088143

Hi @Taylor O​ ,

Column02:

 

"1".equals(row1.Column02)?"Apple":"2".equals(row1.Column02)?"Pear":"defaultValue"

 

Column05:

 

row1.Column05.replaceAll("\\D","")

 

for column 06 i don't understand what you want, could you bring more informations of what you want ?

 

you want to convert digit to String ?

 

String.valueOf(row1.Column06)

Twisted
Contributor
Contributor
Author

Many thanks @guenneguez jeremy​ 16170

 

And where would I place this code? A link to the documentation would be more than fine as well :).

I forgot to mention... Within the timelimit given, I'm not able to complete an entire course on Talend, however seeing it's possibilities I certainly will once I have this assignment completed.

 

Regarding "row1.Column05.replaceAll("\\D","")"

Will this replace all letters or just the letter D?

 

Regarding Column06 perhaps I was not as clear as I thought.

It is similar to like Vlookup in Excel. Alpha.csv has the column06 with numbers of 6 digits.

In Beta.csv there are 2 columns. 1 has numbers equal to the 6 digits of Alpha's column06 and the second column has a string.

Basically the 6 digits of column06 needs to be converted to the corresponding string that is matched in Beta.csv. As a reference kindly see attached image (do note I made columns in Excel, they don't exist in the original CSV for obvious reasons). 0695b00000SsK8tAAF.png

gjeremy1617088143

you have to use a tMap component

https://help.talend.com/r/en-US/8.0/tmap/tmap?tocId=xh4Ajmpb9XV6L2lTX0jX~Q

 

you can make a join between the two csv on String.ValueOf(Column06) and Column02 of beta

https://help.talend.com/r/en-US/8.0/tmap/differences-between-unique-match-first-match-and-all-matche...

 

\D is all non digit char so replaceAll("\\D","") will suppress all non digit char of the String

 

You have to put the code for column02 and 05 in the output fields of the tMap

Twisted
Contributor
Contributor
Author

@guenneguez jeremy​ 

 

Many thanks again!

I seemed to keep getting an error and after rebuilding the tMap in the end it was something simple with a wrong data type 😅 ...

I'm now only stuck at the joining of the two csv's. (sidenote I altered the names earlier on, to avoid the person who I am making it for seeing I asked for help, but seems as I'm now this far that I will only make it more confusing when I keep mentioning Alpha, Beta, and Apple and Pear...)

 

Regarding the tFixedFlowInput, I think I have it working. However I read that you can only test it for real when connecting to a DB. Unless I misunderstood the following topic;

https://community.talend.com/s/feed/0D73p000004kBRWCA2?language=en_US

 

Below is my Job (as I would think it should be).

0695b00000SsNXSAA3.pngBelow is tFixedFlowInput_1

0695b00000SsNd1AAF.png 

And below is the error which I get if I remove "SQL_Connect"

0695b00000SsNUiAAN.png 

  • row2.Column0 has the string which needs to be put in "Bron" in the database.
  • row2.Column01 has the string equal to that of row1.Bron.

 

Is there any way that I can be certain if it works without connecting to the DB and that it writes it to the DB?

Or is there something else that might be wrong?

gjeremy1617088143

i think you have selected personalized content, but you don't fill the content part with some value.

also if row1.Bron is a String and Column1 is an integer you have to convert Bron to Integer in the expression key of the inner join

 

Integer.valueOf(row1.Bron)

Twisted
Contributor
Contributor
Author

gjeremy1617088143

@guenneguez jeremy​ 

 

Many thanks once again for replying to me!

 

[Q]i think you have selected personalized content, but you don't fill the content part with some value.[/Q]

Not following this part, do you mean for the SQL connection? As I'm unable to create a blank DB Connection (left menu).

 

[Q]also if row1.Bron is a String and Column1 is an integer you have to convert Bron to Integer in the expression key of the inner join[/Q]

As a quickfix, I made row1.Bron a string. To keep it as simple as possible.

Unless i'm mistaken, a CSV does not define the data-type correct? If so, then I do need to include Integer.valueOf(row1.Bron)

As I'm not allowed to amend the original CSV files.

 

EDIT:

I put in a tDBOuput_1 but am getting the same error as before.

So feel I'm doing something wrong in the end nevertheless. Or could this be an error of Talend, regarding being unable to write it to a DB due to no connection?

gjeremy1617088143

The error Come from the tFixedFlowInput. You have filled no content in the personalized content field.

if you want to write into a db you can use a tDBOutput component.You can set the connection informations directly in the component

 

by default you can set all the datatype of a csv to String cause it’ s Just text separated by a delimiter, you can also set other type if you think the tFileInputDelimited will parse it in the good type

Twisted
Contributor
Contributor
Author

@guenneguez jeremy​ 

I think it was good that I took a break yesterday. Redid all that you said and now it does work.

Seems I was totally overlooking the content field in the manual *facepalm*

Output works as it should.

Despite this might have been peanuts for you, a HUGE thank you for your patience and help!

 

It still seems that I have trouble with a separator.

For some reason there is a double separator. See below line as example;

7277536-9A;Mevrouw;Naereboutstraat;49;;Aanbevolen door derden;

 

gjeremy1617088143

in a csv file, you have one Less delimiter than the number of column.

if you have 5 column you have 4 delimiters.

if you have 2 delimiters in a row it’s maybe because the field is null for this row,

 

for eg you have a csv file with

city name and country and continent

 

city name;country;continent

Paris;France;Europa

New york;USA;America

Alger;;Africa

 

here the country is missing but delimiter had to be there to parse properly the data

 

if you have more or the same number of delimiter than column number your row have bad data you have to reject this one or clean up before process