<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic tDTDValidator error with valid UTF-8 files and special characters in Talend Studio</title>
    <link>https://community.qlik.com/t5/Talend-Studio/tDTDValidator-error-with-valid-UTF-8-files-and-special/m-p/2201740#M3704</link>
    <description>&lt;P&gt;Hi there,&lt;/P&gt;
&lt;P&gt;We are using Talend Studio 6.4.1 and trying to process XML files that we want to validate against a DTD file beforehand. However, we are running into problems with XML files containing special characters such as the euro sign (€) or the eszett (ß). The tDTDValidator component fails with an error:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;[FATAL]: abc.ordersvalidate_0_1.OrdersValidate - tDTDValidator_1 Invalid byte 2 of 2-byte UTF-8 sequence.&lt;/PRE&gt;
&lt;P&gt;The error pattern looks like the one described in this &lt;A href="https://www.talendforge.org/forum/viewtopic.php?id=37153" target="_blank" rel="noopener nofollow noopener noreferrer"&gt;thread&lt;/A&gt;&amp;nbsp;([resolved] Error with tDTDValidator) at talendforge.&lt;/P&gt;
&lt;P&gt;It seems that tDTDValidator uses the encoding "ISO-8859-1" for the XML file regardless of the encoding declared inside the XML file.&lt;/P&gt;
&lt;PRE&gt;                String encoding = null;
                if (doctDTDValidator_1.getXmlEncoding() == null) {
                    encoding = "ISO-8859-1";
                } else {
                    encoding = doctDTDValidator_1.getXmlEncoding();
                }&lt;/PRE&gt;
&lt;P&gt;The workaround suggested in that thread is to use tXSDValidator, which we cannot use.&lt;/P&gt;
&lt;P&gt;Is this a known bug of the tDTDValidator component that has been fixed in a later version, or what workarounds are available?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;KR&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;joboro&lt;/P&gt;</description>
    <pubDate>Sat, 16 Nov 2024 04:22:52 GMT</pubDate>
    <dc:creator>joboro1</dc:creator>
    <dc:date>2024-11-16T04:22:52Z</dc:date>
    <item>
      <title>tDTDValidator error with valid UTF-8 files and special characters</title>
      <link>https://community.qlik.com/t5/Talend-Studio/tDTDValidator-error-with-valid-UTF-8-files-and-special/m-p/2201740#M3704</link>
      <description>&lt;P&gt;Hi there,&lt;/P&gt;
&lt;P&gt;We are using Talend Studio 6.4.1 and trying to process XML files that we want to validate against a DTD file beforehand. However, we are running into problems with XML files containing special characters such as the euro sign (€) or the eszett (ß). The tDTDValidator component fails with an error:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;[FATAL]: abc.ordersvalidate_0_1.OrdersValidate - tDTDValidator_1 Invalid byte 2 of 2-byte UTF-8 sequence.&lt;/PRE&gt;
&lt;P&gt;The error pattern looks like the one described in this &lt;A href="https://www.talendforge.org/forum/viewtopic.php?id=37153" target="_blank" rel="noopener nofollow noopener noreferrer"&gt;thread&lt;/A&gt;&amp;nbsp;([resolved] Error with tDTDValidator) at talendforge.&lt;/P&gt;
&lt;P&gt;It seems that tDTDValidator uses the encoding "ISO-8859-1" for the XML file regardless of the encoding declared inside the XML file.&lt;/P&gt;
&lt;PRE&gt;                String encoding = null;
                if (doctDTDValidator_1.getXmlEncoding() == null) {
                    encoding = "ISO-8859-1";
                } else {
                    encoding = doctDTDValidator_1.getXmlEncoding();
                }&lt;/PRE&gt;
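&lt;P&gt;To illustrate the suspected mismatch outside Talend, here is a minimal sketch. It assumes the document's bytes end up encoded as ISO-8859-1 while the parser still expects UTF-8. The class and method names are hypothetical and are not part of the generated Talend code. In ISO-8859-1, ß is the single byte 0xDF, which a UTF-8 decoder reads as the first byte of a 2-byte sequence, so the next byte is rejected. That would be consistent with the "Invalid byte 2 of 2-byte UTF-8 sequence" message.&lt;/P&gt;

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of Talend.
public class EncodingMismatchDemo {

    // Strictly decodes the bytes as UTF-8.
    // Returns null when the bytes are not valid UTF-8.
    static String decodeStrictUtf8(byte[] bytes) {
        try {
            return StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes))
                    .toString();
        } catch (CharacterCodingException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        String text = "Stra\u00dfe"; // "Strasse" written with an eszett

        // Same text, two different byte representations.
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        byte[] latin1 = text.getBytes(StandardCharsets.ISO_8859_1);

        System.out.println("UTF-8 bytes are valid UTF-8: "
                + (decodeStrictUtf8(utf8) != null));
        System.out.println("ISO-8859-1 bytes are valid UTF-8: "
                + (decodeStrictUtf8(latin1) != null));
    }
}
```

&lt;P&gt;The first check reports true and the second reports false. The euro sign has no ISO-8859-1 representation at all, so it cannot survive a round trip through that charset.&lt;/P&gt;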
&lt;P&gt;The workaround suggested in that thread is to use tXSDValidator, which we cannot use.&lt;/P&gt;
&lt;P&gt;Is this a known bug of the tDTDValidator component that has been fixed in a later version, or what workarounds are available?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;KR&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;joboro&lt;/P&gt;</description>
      <pubDate>Sat, 16 Nov 2024 04:22:52 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/tDTDValidator-error-with-valid-UTF-8-files-and-special/m-p/2201740#M3704</guid>
      <dc:creator>joboro1</dc:creator>
      <dc:date>2024-11-16T04:22:52Z</dc:date>
    </item>
    <item>
      <title>Re: tDTDValidator error with valid UTF-8 files and special characters</title>
      <link>https://community.qlik.com/t5/Talend-Studio/tDTDValidator-error-with-valid-UTF-8-files-and-special/m-p/2201741#M3705</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;The present version of tDTDValidator does not have an option to use UTF-8; it is configured for the ISO-8859-1 character set.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If you are an enterprise customer, could you please create a support case to check whether a quick patch can be provided to make the encoding configurable?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If you are using the open source version, could you please create a feature request using the link below?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;&lt;A href="https://jira.talendforge.org/" target="_blank" rel="nofollow noopener noreferrer"&gt;https://jira.talendforge.org&lt;/A&gt;&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;Warm Regards,&lt;BR /&gt;Nikhil Thampi&lt;/P&gt;</description>
      <pubDate>Tue, 15 Oct 2019 13:22:39 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/tDTDValidator-error-with-valid-UTF-8-files-and-special/m-p/2201741#M3705</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2019-10-15T13:22:39Z</dc:date>
    </item>
    <item>
      <title>Re: tDTDValidator error with valid UTF-8 files and special characters</title>
      <link>https://community.qlik.com/t5/Talend-Studio/tDTDValidator-error-with-valid-UTF-8-files-and-special/m-p/2201742#M3706</link>
      <description>I opened a support case.&lt;BR /&gt;KR</description>
      <pubDate>Tue, 15 Oct 2019 14:53:04 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/tDTDValidator-error-with-valid-UTF-8-files-and-special/m-p/2201742#M3706</guid>
      <dc:creator>joboro1</dc:creator>
      <dc:date>2019-10-15T14:53:04Z</dc:date>
    </item>
  </channel>
</rss>

