Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi All,
I'm facing problems in loading data from csv file , its encoding is ANSI and character set is windows-1252.
It also has a unusual delimitter , which is the Pilcrow character ¶
I have lot of similar files , I'm able to load most of them , but for couple of files I'm getting an invalid number error (cannot convert to INT)
I found the record which was causing issue , visually I cannot see any special characters..
8¶1¶¶¶¶¶¶¶¶¶¶¶10000¶¶¶¶¶¶¶¶¶¶¶¶4¶¶¶¶¶2019-05-06¶4¶
But when I run it through Talend , I get below error
com.sap.db.jdbc.exceptions.JDBCDriverException: SAP DBTech JDBC: [339]: invalid number: not a valid number string '': type_code=29, index=10not a valid number string '10000': type_code=29, index=12not a valid number string '': type_code=29, index=16
at com.sap.db.jdbc.exceptions.SQLExceptionSapDB._newInstance(SQLExceptionSapDB.java:191)
at com.sap.db.jdbc.exceptions.SQLExceptionSapDB.newInstance(SQLExceptionSapDB.java:42)
at com.sap.db.jdbc.packet.HReplyPacket._buildExceptionChain(HReplyPacket.java:977)
I tried using replace in all my numeric fields , replaceAll("[^\\x00-\\x7F]", "")
But it does not help .
If I try to convert the file in UTF-8 on notepad++ , everything is messed up...
Below rows from the same file , load without any issue
1¶0¶StringOfLength100¶test¶123456¶ES¶123456789¶0001041090¶2019-12-11 00:00:00.0¶10000¶N¶10000¶10000¶StringOfLength1000¶2018-12-11 00:00:00.0¶23¶1¶2018-12-11 00:00:00.0¶Ko ¶1100¶A¶98000¶09¶7¶5¶2018-12-11 00:00:00.0¶AU¶St ame¶12¶2019-05-06¶1¶100.00
2¶1¶StringOfLength100¶test¶123456¶ES¶123456789¶0001041090¶2019-12-11 00:00:00.0¶10000¶N¶10000¶10000¶StringOfLength1000¶2018-12-11 00:00:00.0¶23¶1¶2018-12-11 00:00:00.0¶Sha¶1100¶A¶98000¶09¶7¶5¶2018-12-11 00:00:00.0¶AU¶St ame¶12¶2019-05-06¶1¶100.00
Any pointers or help ??
Hi,
The issue is for the column containing 10000 since it is having some padded characters. Could you please try below function to remove them?
row1.data.replaceAll("[^\\d]", "")
Warm Regards,
Nikhil Thampi
Please appreciate our Talend community members by giving Kudos for sharing their time for your query. If your query is answered, please mark the topic as resolved 🙂
Hi,
The issue is for the column containing 10000 since it is having some padded characters. Could you please try below function to remove them?
row1.data.replaceAll("[^\\d]", "")
Warm Regards,
Nikhil Thampi
Please appreciate our Talend community members by giving Kudos for sharing their time for your query. If your query is answered, please mark the topic as resolved 🙂