<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: issue with unicode character in JSON with tFileInputJSON_1 in Talend Studio</title>
    <link>https://community.qlik.com/t5/Talend-Studio/issue-with-unicode-character-in-JSON-with-tFileInputJSON-1/m-p/2320716#M90830</link>
    <description>Thanks dj.&lt;BR /&gt;In my case I checked the incoming file and see it written as&amp;nbsp;"\u001BSam", which I interpreted as [ESC]Sam.&lt;BR /&gt;That was why I tried to replace "\u001b".&lt;BR /&gt;But even if I got the replace to work, that would only help if I did it for all possible breaking characters. &amp;nbsp;Do you know if it is objecting to any Unicode character, or just to the fact that it is a control character?&lt;BR /&gt;I haven't tried changing the encoding - I will explore that. &amp;nbsp;Ideally, though, I would like to strip or ignore such control characters rather than allow them through...</description>
    <pubDate>Tue, 20 Dec 2016 21:56:59 GMT</pubDate>
    <dc:creator>soowork</dc:creator>
    <dc:date>2016-12-20T21:56:59Z</dc:date>
    <item>
      <title>issue with unicode character in JSON with tFileInputJSON_1</title>
      <link>https://community.qlik.com/t5/Talend-Studio/issue-with-unicode-character-in-JSON-with-tFileInputJSON-1/m-p/2320714#M90828</link>
      <description>&lt;P&gt;I am using Talend BigData 5.4.1 (5.4.1.r111943). &amp;nbsp;Similar to &lt;A href="https://community.qlik.com/s/feed/0D53p00007vCj9ECAS" target="_blank"&gt;https://community.talend.com/t5/Design-and-Development/Invalid-XML-character-in-json-file/m-p/66674&lt;/A&gt;, I am encountering an error when trying to consume a JSON file that has a unicode control character in it. &amp;nbsp;In my case it failed with "An invalid XML character (Unicode: 0x1b) was found in the element content of the document." in tFileInputJSON.&lt;BR /&gt;&lt;BR /&gt;Is there a fix or workaround for this issue yet?&lt;BR /&gt;&lt;BR /&gt;Like&amp;nbsp;GuruGulabKhatri, I also tried to strip the unicode character out in a tMap and had no luck (e.g. row1.line.replaceAll("\\u001b", "")).&lt;BR /&gt;&lt;BR /&gt;If there is no fix or workaround, is it known exactly which Unicode characters will cause&amp;nbsp;tFileInputJSON to fail?&lt;BR /&gt;&lt;BR /&gt;Thanks in advance.&lt;/P&gt;</description>
      <pubDate>Mon, 19 Dec 2016 20:52:12 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/issue-with-unicode-character-in-JSON-with-tFileInputJSON-1/m-p/2320714#M90828</guid>
      <dc:creator>soowork</dc:creator>
      <dc:date>2016-12-19T20:52:12Z</dc:date>
    </item>
    <item>
      <title>Re: issue with unicode character in JSON with tFileInputJSON_1</title>
      <link>https://community.qlik.com/t5/Talend-Studio/issue-with-unicode-character-in-JSON-with-tFileInputJSON-1/m-p/2320715#M90829</link>
      <description>Hi soowork
&lt;BR /&gt;
&lt;BR /&gt;&amp;gt;I also tried to strip the unicode character out in a tMap and had no luck (e.g. row1.line.replaceAll("\\u001b", "")).
&lt;BR /&gt;0x1b is not necessarily \u001b (for example, \u00b7 is 0xc2b7 when encoded as UTF-8), so you need to find a translation table like the one here: 
&lt;A href="https://en.wikipedia.org/wiki/List_of_Unicode_characters" target="_blank" rel="nofollow noopener noreferrer"&gt;https://en.wikipedia.org/wiki/List_of_Unicode_characters&lt;/A&gt;
&lt;BR /&gt;
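One possible reason the replaceAll above has no effect: in a Java regex, the pattern "\\u001b" matches the actual ESC character (U+001B), not the six-character JSON escape text that the file may contain. The sketch below is only an illustration, not a Talend API. The class and method names are hypothetical, and it assumes the error is raised for characters that XML 1.0 does not allow (control characters below 0x20 other than tab, line feed and carriage return), which is inferred from the error message rather than confirmed for tFileInputJSON. It uses only String.replaceAll and would have to run on the raw text before the JSON component parses it.
&lt;BR /&gt;

```java
public class ControlCharStripper {

    // Matches the six-character JSON escape text (backslash, 'u', four hex digits)
    // for code points that XML 1.0 rejects: 0x00-0x08, 0x0B, 0x0C and 0x0E-0x1F.
    // Limitation: it does not check whether the backslash is itself escaped.
    static final String ESCAPED_CONTROLS = "\\\\u00(?:0[0-8bBcCeEfF]|1[0-9a-fA-F])";

    // Matches the same code points when they appear as raw characters.
    static final String RAW_CONTROLS = "[\\x00-\\x08\\x0B\\x0C\\x0E-\\x1F]";

    static String stripEscapedControls(String json) {
        return json.replaceAll(ESCAPED_CONTROLS, "");
    }

    static String stripRawControls(String text) {
        return text.replaceAll(RAW_CONTROLS, "");
    }

    public static void main(String[] args) {
        // JSON text containing the escape sequence written out as six characters.
        String escaped = "{\"name\":\"\\u001BSam\"}";
        System.out.println(stripEscapedControls(escaped)); // prints {"name":"Sam"}

        // A string containing the real ESC character (code point 27).
        String raw = (char) 27 + "Sam";
        System.out.println(stripRawControls(raw)); // prints Sam
    }
}
```

&lt;BR /&gt;Tab, line feed and carriage return are left alone in both forms, since XML 1.0 accepts them.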
&lt;BR /&gt;&amp;gt;If there is no fix or workaround, is it known exactly which Unicode characters will cause&amp;nbsp;tFileInputJSON to fail?
&lt;BR /&gt;I think this might be the case when there is no equivalent of the Unicode character in your target code page (which I would expect to be ISO-8859-15), but this is only a guess as I don't have Talend BigData.
&lt;BR /&gt;
&lt;BR /&gt;In TIS 5.4, my JSON component has an option in the "Advanced settings" section to switch the encoding. Have you tried that?
&lt;BR /&gt;
&lt;BR /&gt;cheers
&lt;BR /&gt;dj</description>
      <pubDate>Tue, 20 Dec 2016 13:36:38 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/issue-with-unicode-character-in-JSON-with-tFileInputJSON-1/m-p/2320715#M90829</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2016-12-20T13:36:38Z</dc:date>
    </item>
    <item>
      <title>Re: issue with unicode character in JSON with tFileInputJSON_1</title>
      <link>https://community.qlik.com/t5/Talend-Studio/issue-with-unicode-character-in-JSON-with-tFileInputJSON-1/m-p/2320716#M90830</link>
      <description>Thanks dj.&lt;BR /&gt;In my case I checked the incoming file and see it written as&amp;nbsp;"\u001BSam", which I interpreted as [ESC]Sam.&lt;BR /&gt;That was why I tried to replace "\u001b".&lt;BR /&gt;But even if I got the replace to work, that would only help if I did it for all possible breaking characters. &amp;nbsp;Do you know if it is objecting to any Unicode character, or just to the fact that it is a control character?&lt;BR /&gt;I haven't tried changing the encoding - I will explore that. &amp;nbsp;Ideally, though, I would like to strip or ignore such control characters rather than allow them through...</description>
      <pubDate>Tue, 20 Dec 2016 21:56:59 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/issue-with-unicode-character-in-JSON-with-tFileInputJSON-1/m-p/2320716#M90830</guid>
      <dc:creator>soowork</dc:creator>
      <dc:date>2016-12-20T21:56:59Z</dc:date>
    </item>
  </channel>
</rss>

