lcupito
Contributor

Problems handling dirty data

Hello,

I'm trying to handle dirty data in a CSV file. I have an integer column, but some of its values are "n.a."

I've tried to handle this error in a tMap, but I couldn't get it to work.

7 Replies
Anonymous
Not applicable

Can you send us more information on this? We will need to see an example of the data that you are using (an example, not private data) and a couple of screenshots of your job, so that we can see what you are doing at the moment.

jlolling
Creator III

Yes, this is sometimes a mess. In your case I suggest setting the type of the CSV source column to String.

You can then use the tConvertType component and, on its output, set the actual integer columns to type Integer.

Make sure the tConvertType is not set to Die on Error; you can also route the invalid values to a reject flow.
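To make the "convert or reject" idea concrete, here is a minimal plain-Java sketch of the logic. It is only an illustration under my own assumptions, not the code Talend generates for tConvertType; the class name `ConvertOrReject` and its fields are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java illustration of "convert or reject"; NOT Talend's generated code.
// The class and field names are hypothetical.
class ConvertOrReject {
    // Values that parsed cleanly (the "main" flow).
    final List<Integer> converted = new ArrayList<>();
    // Raw values that could not be parsed (the "reject" flow).
    final List<String> rejected = new ArrayList<>();

    // Try to parse one raw CSV value; route failures to the reject list
    // instead of letting the exception stop the whole run.
    void accept(String raw) {
        try {
            converted.add(Integer.valueOf(raw.trim()));
        } catch (NumberFormatException | NullPointerException e) {
            rejected.add(raw);
        }
    }
}
```

Feeding it "12", "n.a." and " 7 " would leave 12 and 7 in `converted` and "n.a." in `rejected`, which mirrors reading the column as String and converting it afterwards.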

 

Another solution is using a routine.

I have written many of them, and you are free to use the source code as you like. For your use case, you can call the method getFailSaveInt(...) of the NumberUtil routine in the tMap to convert the String into a number in a fail-safe way.

Take a look at this project, in the source folder src/routines: https://github.com/jlolling/talend_routines

To install a routine, create a new routine with the same name as the one in the project. Once it has been created successfully (with its dummy method), replace its whole content with the source code from the project.
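For readers who only want the idea behind such a routine, a fail-safe conversion could look roughly like the sketch below. This is a hypothetical stand-in: `SafeNumberUtil` and `toIntOrDefault` are names I made up for illustration, and they are not the API of the NumberUtil routine in the linked project.

```java
// Hypothetical routine sketch; the names are illustrative and are NOT
// the API of the NumberUtil routine from the linked project.
// In Talend, a routine would normally be a public class in the "routines" package.
class SafeNumberUtil {

    // Returns the parsed integer, or defaultValue when the input is null,
    // blank, or not a valid integer (for example "n.a.").
    static Integer toIntOrDefault(String raw, Integer defaultValue) {
        if (raw == null) {
            return defaultValue;
        }
        String s = raw.trim();
        if (s.isEmpty()) {
            return defaultValue;
        }
        try {
            return Integer.valueOf(s);
        } catch (NumberFormatException e) {
            return defaultValue;
        }
    }
}
```

In a tMap output expression this would be called along the lines of `SafeNumberUtil.toIntOrDefault(row1.amount, null)`, where `row1.amount` is an assumed String input column. Passing null as the default keeps bad values distinguishable from a real 0, provided the output column is nullable.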

lcupito
Contributor
Author

Hi,

This is the job: [image: 0695b00000aFPD9AAO.jpg]

The tMap structure: [image: 0695b00000aFPDJAA4.png]

The dirty data I have: [image: 0695b00000aFPDOAA4.jpg]

The errors: [image: 0695b00000aFPRBAA4.jpg]

jlolling
Creator III

OK, I would put a tConvertType component right after the tFileInputDelimited, define the problematic column as String, and define the outgoing column of the tConvertType as Integer.

jlolling
Creator III

Or you can install the routines I described and use them in the expression of the output column.

Anonymous
Not applicable

@Luca Cupitò it looks like @Jan Lolling got here first. I'd do exactly the same as he suggested. He deserves a best answer for this 😉

RJLC
Contributor

Maybe you can use a tReplace between the tFileInputDelimited and the tMap to replace the “n/a” value with a valid value, for example 0.

The integer field must be defined as a String in the tFileInputDelimited so that it can be used in the tReplace component; convert it to an integer afterwards, wherever you need it.
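As a rough illustration of this replace-then-convert order, the following plain-Java sketch does the same two steps by hand. It is not how tReplace is implemented; `ReplaceThenParse` and its method names are hypothetical.

```java
// Plain-Java illustration of "replace the placeholder, then convert".
// NOT the tReplace implementation; all names here are hypothetical.
class ReplaceThenParse {

    // Step 1: while the column is still a String, swap the placeholder
    // (for example "n/a") for a valid numeric string such as "0".
    static String replacePlaceholder(String raw, String placeholder, String replacement) {
        if (raw != null && raw.trim().equalsIgnoreCase(placeholder)) {
            return replacement;
        }
        return raw;
    }

    // Step 2: convert the cleaned String to an int where it is needed.
    // Still throws NumberFormatException for any other non-numeric value.
    static int toInt(String cleaned) {
        return Integer.parseInt(cleaned.trim());
    }
}
```

One design point to keep in mind: after replacing with 0, a missing value can no longer be told apart from a genuine 0 in the data.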

 

Best regards