<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Special characters in csv files , Encoding ANSI , character set Windows-1252 in Talend Studio</title>
    <link>https://community.qlik.com/t5/Talend-Studio/Special-characters-in-csv-files-Encoding-ANSI-character-set/m-p/2227355#M18989</link>
    <description>&lt;P&gt;Hi All,&lt;/P&gt; 
&lt;P&gt;I'm having trouble loading data from a CSV file. Its encoding is ANSI and its character set is Windows-1252.&lt;/P&gt; 
&lt;P&gt;It also has an unusual delimiter, the pilcrow character&amp;nbsp;¶&lt;/P&gt; 
&lt;P&gt;I have a lot of similar files and I'm able to load most of them, but for a couple of files I'm getting an invalid number error (cannot convert to INT).&lt;/P&gt; 
&lt;P&gt;I found the record that was causing the issue, but visually I cannot see any special characters.&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;8¶1¶¶¶¶¶¶¶¶¶¶¶10000¶¶¶¶¶¶¶¶¶¶¶¶4¶¶¶¶¶2019-05-06¶4¶&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;But when I run it through Talend, I get the error below:&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;com.sap.db.jdbc.exceptions.JDBCDriverException: SAP DBTech JDBC: [339]: invalid number: not a valid number string '&amp;#29;': type_code=29, index=10not a valid number string '&amp;#29;&amp;#5;10000&amp;#29;': type_code=29, index=12not a valid number string '&amp;#3;&amp;#1;': type_code=29, index=16&lt;BR /&gt;at com.sap.db.jdbc.exceptions.SQLExceptionSapDB._newInstance(SQLExceptionSapDB.java:191)&lt;BR /&gt;at com.sap.db.jdbc.exceptions.SQLExceptionSapDB.newInstance(SQLExceptionSapDB.java:42)&lt;BR /&gt;at com.sap.db.jdbc.packet.HReplyPacket._buildExceptionChain(HReplyPacket.java:977)&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;I tried using a replace on all my numeric fields:&amp;nbsp;replaceAll("[^\\x00-\\x7F]", "")&lt;/P&gt; 
&lt;P&gt;But it does not help.&lt;/P&gt; 
&lt;P&gt;If I try to convert the file to UTF-8 in Notepad++, everything gets messed up.&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;The rows below, from the same file, load without any issue:&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;&lt;BR /&gt;1¶0¶StringOfLength100¶test¶123456¶ES¶123456789¶0001041090¶2019-12-11 00:00:00.0¶10000¶N¶10000¶10000¶StringOfLength1000¶2018-12-11 00:00:00.0¶23¶1¶2018-12-11 00:00:00.0¶Ko ¶1100¶A¶98000¶09¶7¶5¶2018-12-11 00:00:00.0¶AU¶St ame¶12¶2019-05-06¶1¶100.00&lt;BR /&gt;2¶1¶StringOfLength100¶test¶123456¶ES¶123456789¶0001041090¶2019-12-11 00:00:00.0¶10000¶N¶10000¶10000¶StringOfLength1000¶2018-12-11 00:00:00.0¶23¶1¶2018-12-11 00:00:00.0¶Sha¶1100¶A¶98000¶09¶7¶5¶2018-12-11 00:00:00.0¶AU¶St ame¶12¶2019-05-06¶1¶100.00&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;Any pointers or help ??&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="snap1.PNG" style="width: 999px;"&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="0683p000009M6Zb.png"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/148706iD7C3EF4F544AC252/image-size/large?v=v2&amp;amp;px=999" role="button" title="0683p000009M6Zb.png" alt="0683p000009M6Zb.png" /&gt;&lt;/span&gt;&lt;/SPAN&gt;&lt;SPAN class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="snap2.PNG" style="width: 860px;"&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="0683p000009M6Kx.png"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/152250iDDCFCA2678804685/image-size/large?v=v2&amp;amp;px=999" role="button" title="0683p000009M6Kx.png" alt="0683p000009M6Kx.png" /&gt;&lt;/span&gt;&lt;/SPAN&gt;&lt;/P&gt;</description>
    <pubDate>Sat, 16 Nov 2024 05:17:36 GMT</pubDate>
    <dc:creator>karandama2006</dc:creator>
    <dc:date>2024-11-16T05:17:36Z</dc:date>
    <item>
      <title>Special characters in csv files , Encoding ANSI , character set Windows-1252</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Special-characters-in-csv-files-Encoding-ANSI-character-set/m-p/2227355#M18989</link>
      <description>&lt;P&gt;Hi All,&lt;/P&gt; 
&lt;P&gt;I'm having trouble loading data from a CSV file. Its encoding is ANSI and its character set is Windows-1252.&lt;/P&gt; 
&lt;P&gt;It also has an unusual delimiter, the pilcrow character&amp;nbsp;¶&lt;/P&gt; 
&lt;P&gt;I have a lot of similar files and I'm able to load most of them, but for a couple of files I'm getting an invalid number error (cannot convert to INT).&lt;/P&gt; 
&lt;P&gt;I found the record that was causing the issue, but visually I cannot see any special characters.&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;8¶1¶¶¶¶¶¶¶¶¶¶¶10000¶¶¶¶¶¶¶¶¶¶¶¶4¶¶¶¶¶2019-05-06¶4¶&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;But when I run it through Talend, I get the error below:&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;com.sap.db.jdbc.exceptions.JDBCDriverException: SAP DBTech JDBC: [339]: invalid number: not a valid number string '&amp;#29;': type_code=29, index=10not a valid number string '&amp;#29;&amp;#5;10000&amp;#29;': type_code=29, index=12not a valid number string '&amp;#3;&amp;#1;': type_code=29, index=16&lt;BR /&gt;at com.sap.db.jdbc.exceptions.SQLExceptionSapDB._newInstance(SQLExceptionSapDB.java:191)&lt;BR /&gt;at com.sap.db.jdbc.exceptions.SQLExceptionSapDB.newInstance(SQLExceptionSapDB.java:42)&lt;BR /&gt;at com.sap.db.jdbc.packet.HReplyPacket._buildExceptionChain(HReplyPacket.java:977)&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;I tried using a replace on all my numeric fields:&amp;nbsp;replaceAll("[^\\x00-\\x7F]", "")&lt;/P&gt; 
&lt;P&gt;But it does not help.&lt;/P&gt; 
&lt;P&gt;If I try to convert the file to UTF-8 in Notepad++, everything gets messed up.&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;The rows below, from the same file, load without any issue:&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;&lt;BR /&gt;1¶0¶StringOfLength100¶test¶123456¶ES¶123456789¶0001041090¶2019-12-11 00:00:00.0¶10000¶N¶10000¶10000¶StringOfLength1000¶2018-12-11 00:00:00.0¶23¶1¶2018-12-11 00:00:00.0¶Ko ¶1100¶A¶98000¶09¶7¶5¶2018-12-11 00:00:00.0¶AU¶St ame¶12¶2019-05-06¶1¶100.00&lt;BR /&gt;2¶1¶StringOfLength100¶test¶123456¶ES¶123456789¶0001041090¶2019-12-11 00:00:00.0¶10000¶N¶10000¶10000¶StringOfLength1000¶2018-12-11 00:00:00.0¶23¶1¶2018-12-11 00:00:00.0¶Sha¶1100¶A¶98000¶09¶7¶5¶2018-12-11 00:00:00.0¶AU¶St ame¶12¶2019-05-06¶1¶100.00&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;Any pointers or help ??&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="snap1.PNG" style="width: 999px;"&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="0683p000009M6Zb.png"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/148706iD7C3EF4F544AC252/image-size/large?v=v2&amp;amp;px=999" role="button" title="0683p000009M6Zb.png" alt="0683p000009M6Zb.png" /&gt;&lt;/span&gt;&lt;/SPAN&gt;&lt;SPAN class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="snap2.PNG" style="width: 860px;"&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="0683p000009M6Kx.png"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/152250iDDCFCA2678804685/image-size/large?v=v2&amp;amp;px=999" role="button" title="0683p000009M6Kx.png" alt="0683p000009M6Kx.png" /&gt;&lt;/span&gt;&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Sat, 16 Nov 2024 05:17:36 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Special-characters-in-csv-files-Encoding-ANSI-character-set/m-p/2227355#M18989</guid>
      <dc:creator>karandama2006</dc:creator>
      <dc:date>2024-11-16T05:17:36Z</dc:date>
    </item>
    <item>
      <title>Re: Special characters in csv files , Encoding ANSI , character set Windows-1252</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Special-characters-in-csv-files-Encoding-ANSI-character-set/m-p/2227356#M18990</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;The issue is with the column containing 10000, which is padded with extra characters. Could you please try the expression below to remove them?&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;row1.data.replaceAll("[^\\d]", "")&lt;/P&gt; 
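&lt;P&gt;A minimal sketch of why this pattern can help where an ASCII-range filter does not, assuming the padding consists of ASCII control characters (the error message shows character codes 29, 5, 3 and 1). The class and method names are hypothetical and not part of Talend. Note that the pattern also drops minus signs and decimal points, so it only suits unsigned integer columns.&lt;/P&gt; 

```java
public class NumericFieldCleaner {

    // Hypothetical helper (not a Talend API): removes every non-digit
    // character and returns null when no digits are left.
    static Integer toInteger(String raw) {
        if (raw == null) {
            return null;
        }
        String digits = raw.replaceAll("[^\\d]", "");
        return digits.isEmpty() ? null : Integer.valueOf(digits);
    }

    public static void main(String[] args) {
        // Assumed field content: 10000 wrapped in control characters 29 and 5.
        String padded = "\u001D\u000510000\u001D";

        // Control characters sit inside the 0x00-0x7F range, so the
        // ASCII-range filter leaves all 8 characters in place.
        System.out.println(padded.replaceAll("[^\\x00-\\x7F]", "").length()); // 8

        // Keeping digits only yields a parseable value.
        System.out.println(toInteger(padded)); // 10000

        // A field holding nothing but a control character becomes null.
        System.out.println(toInteger("\u001D")); // null
    }
}
```

&lt;P&gt;A field that contains only control characters is empty after cleaning, so the sketch returns null for it instead of passing an empty string to the integer parser.&lt;/P&gt; 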
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;Warm Regards,&lt;BR /&gt;Nikhil Thampi&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Fri, 12 Jul 2019 20:07:35 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Special-characters-in-csv-files-Encoding-ANSI-character-set/m-p/2227356#M18990</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2019-07-12T20:07:35Z</dc:date>
    </item>
    <item>
      <title>Re: Special characters in csv files , Encoding ANSI , character set Windows-1252</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Special-characters-in-csv-files-Encoding-ANSI-character-set/m-p/2227357#M18991</link>
      <description>Thanks, that works. I've used the code below:&lt;BR /&gt;&lt;BR /&gt;row1.data.equals("") ? null : Integer.parseInt(row1.data.replaceAll("[^\\d]", ""))</description>
      <pubDate>Mon, 15 Jul 2019 07:40:08 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Special-characters-in-csv-files-Encoding-ANSI-character-set/m-p/2227357#M18991</guid>
      <dc:creator>karandama2006</dc:creator>
      <dc:date>2019-07-15T07:40:08Z</dc:date>
    </item>
  </channel>
</rss>

