olja
Contributor

Is it possible to detect if a file is in expected encoding (utf8)?

I have a Java library that consumes incoming files. The problem is that it only works if the file is UTF-8 encoded. Is there a component or a best practice for checking this with Talend? I want to reject files that are not UTF-8 encoded.

Thanks

3 Replies
Anonymous
Not applicable

Hello,

So far, there is no component or built-in function that can be used to detect the file encoding. You could write a routine in Talend to detect it.

https://stackoverflow.com/questions/499010/java-how-to-determine-the-correct-charset-encoding-of-a-s...
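
For the narrower question of whether a file is valid UTF-8, such a routine can be sketched with the JDK alone, with no charset guessing involved. The following is only a minimal sketch, not a built-in Talend feature: the class name Utf8Check and the method isValidUtf8 are hypothetical names chosen for illustration. It decodes the bytes with a strict decoder and treats any malformed byte sequence as "not UTF-8". Note that it reads the whole file into memory, and that plain ASCII files also pass, since ASCII is a subset of UTF-8.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper; the names are illustrative, not part of Talend.
final class Utf8Check {

    // Returns true if the bytes form a well-formed UTF-8 sequence.
    static boolean isValidUtf8(byte[] bytes) {
        // REPORT makes the decoder throw instead of silently
        // substituting replacement characters for bad input.
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    // Convenience overload: loads the whole file, then checks it.
    static boolean isValidUtf8(Path file) throws IOException {
        return isValidUtf8(Files.readAllBytes(file));
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.out.println("usage: java Utf8Check <file>");
            return;
        }
        System.out.println(isValidUtf8(Paths.get(args[0])));
    }
}
```

In a job, a method like this could be called before the file is handed to the consuming library, and the file rejected when it returns false. A file that passes is well-formed UTF-8, which is not strictly a proof that it was written as UTF-8, but text in other encodings that contains non-ASCII characters rarely decodes cleanly.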

Best regards

Sabrina

olja
Contributor
Author

Thanks for the reply.

In case someone has the same problem: I solved it with an external jar, juniversalchardet, which provides the class org.mozilla.universalchardet.UniversalDetector.

https://github.com/albfernandez/juniversalchardet

It works quite well. I added a Java routine and use it in my job. If it does not return true, I throw an exception and the file is handled differently.

Anonymous
Not applicable

Hello,

Great that it works. Thanks for sharing it with the community.

Best regards

Sabrina