<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Open Studio for DQ can not handle special characters in CSV-File encoded as utf-8 in Data Quality</title>
    <link>https://community.qlik.com/t5/Data-Quality/Open-Studio-for-DQ-can-not-handle-special-characters-in-CSV-File/m-p/2196850#M146</link>
    <description>&lt;P&gt;Hi,&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;I'm using Talend Open Studio for Data Quality version 6.5.1 to analyze the quality of data in a CSV file encoded in UTF-8. If I select the 'Soundex Frequency' indicator for a column whose values contain special characters such as "ü" and "é" and run the analysis, I get the following error message:&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;PRE&gt;2018-05-04 17:14:20,232 ERROR org.talend.dq.analysis.AnalysisExecutor  - java.lang.IllegalArgumentException: The character is not mapped: Ü
java.lang.IllegalArgumentException: The character is not mapped: Ü
	at org.apache.commons.codec.language.Soundex.map(Soundex.java:226)
	at org.apache.commons.codec.language.Soundex.getMappingCode(Soundex.java:180)
	at org.apache.commons.codec.language.Soundex.soundex(Soundex.java:264)
	at org.talend.dataquality.indicators.impl.SoundexFreqIndicatorImpl.handle(SoundexFreqIndicatorImpl.java:283)
	at org.talend.dq.indicators.DelimitedFileIndicatorEvaluator.handleByARow(DelimitedFileIndicatorEvaluator.java:335)
	at org.talend.dq.indicators.DelimitedFileIndicatorEvaluator.useCsvReader(DelimitedFileIndicatorEvaluator.java:257)
	at org.talend.dq.indicators.DelimitedFileIndicatorEvaluator.executeSqlQuery(DelimitedFileIndicatorEvaluator.java:115)
	at org.talend.dq.indicators.Evaluator.evaluateIndicators(Evaluator.java:146)
	at org.talend.dq.indicators.Evaluator.evaluateIndicators(Evaluator.java:207)
	at org.talend.dq.analysis.DelimitedFileAnalysisExecutor.runAnalysis(DelimitedFileAnalysisExecutor.java:70)
	at org.talend.dq.analysis.AnalysisExecutor.execute(AnalysisExecutor.java:146)
	at org.talend.dq.analysis.AnalysisExecutorSelector.executeAnalysis(AnalysisExecutorSelector.java:171)
	at org.talend.dataprofiler.core.ui.action.actions.RunAnalysisAction$1.runInWorkspace(RunAnalysisAction.java:222)
	at org.eclipse.core.internal.resources.InternalWorkspaceJob.run(InternalWorkspaceJob.java:38)
	at org.eclipse.core.internal.jobs.Worker.run(Worker.java:54)&lt;/PRE&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;I've already tried the solution from this post: &lt;A href="https://community.talend.com/t5/Design-and-Development/Handling-special-characters/m-p/25169#M4268" target="_blank"&gt;https://community.talend.com/t5/Design-and-Development/Handling-special-characters/m-p/25169#M4268&lt;/A&gt;&lt;/P&gt; 
&lt;P&gt;I also checked "Allow specific characters (UTF8,...) for columns of schemas" under Window / Preferences / Talend / Specific Settings.&lt;/P&gt; 
&lt;P&gt;Neither solution worked for me.&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;Is there any workaround to solve the problem?&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;Thanks in advance&lt;/P&gt; 
&lt;P&gt;Frank&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
    <pubDate>Sat, 16 Nov 2024 08:18:32 GMT</pubDate>
    <dc:creator>Anonymous</dc:creator>
    <dc:date>2024-11-16T08:18:32Z</dc:date>
    <item>
      <title>Open Studio for DQ can not handle special characters in CSV-File encoded as utf-8</title>
      <link>https://community.qlik.com/t5/Data-Quality/Open-Studio-for-DQ-can-not-handle-special-characters-in-CSV-File/m-p/2196850#M146</link>
      <description>&lt;P&gt;Hi,&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;I'm using Talend Open Studio for Data Quality version 6.5.1 to analyze the quality of data in a CSV file encoded in UTF-8. If I select the 'Soundex Frequency' indicator for a column whose values contain special characters such as "ü" and "é" and run the analysis, I get the following error message:&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;PRE&gt;2018-05-04 17:14:20,232 ERROR org.talend.dq.analysis.AnalysisExecutor  - java.lang.IllegalArgumentException: The character is not mapped: Ü
java.lang.IllegalArgumentException: The character is not mapped: Ü
	at org.apache.commons.codec.language.Soundex.map(Soundex.java:226)
	at org.apache.commons.codec.language.Soundex.getMappingCode(Soundex.java:180)
	at org.apache.commons.codec.language.Soundex.soundex(Soundex.java:264)
	at org.talend.dataquality.indicators.impl.SoundexFreqIndicatorImpl.handle(SoundexFreqIndicatorImpl.java:283)
	at org.talend.dq.indicators.DelimitedFileIndicatorEvaluator.handleByARow(DelimitedFileIndicatorEvaluator.java:335)
	at org.talend.dq.indicators.DelimitedFileIndicatorEvaluator.useCsvReader(DelimitedFileIndicatorEvaluator.java:257)
	at org.talend.dq.indicators.DelimitedFileIndicatorEvaluator.executeSqlQuery(DelimitedFileIndicatorEvaluator.java:115)
	at org.talend.dq.indicators.Evaluator.evaluateIndicators(Evaluator.java:146)
	at org.talend.dq.indicators.Evaluator.evaluateIndicators(Evaluator.java:207)
	at org.talend.dq.analysis.DelimitedFileAnalysisExecutor.runAnalysis(DelimitedFileAnalysisExecutor.java:70)
	at org.talend.dq.analysis.AnalysisExecutor.execute(AnalysisExecutor.java:146)
	at org.talend.dq.analysis.AnalysisExecutorSelector.executeAnalysis(AnalysisExecutorSelector.java:171)
	at org.talend.dataprofiler.core.ui.action.actions.RunAnalysisAction$1.runInWorkspace(RunAnalysisAction.java:222)
	at org.eclipse.core.internal.resources.InternalWorkspaceJob.run(InternalWorkspaceJob.java:38)
	at org.eclipse.core.internal.jobs.Worker.run(Worker.java:54)&lt;/PRE&gt; 
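&lt;P&gt;For context, the trace above ends in the Soundex mapping step. The following is a minimal sketch of that behaviour, assuming a US-English Soundex table that covers only the letters A to Z. The class name SoundexMappingSketch and its two tables are illustrative assumptions, not the Apache Commons Codec source:&lt;/P&gt;

```java
public class SoundexMappingSketch {
    // Letters covered by the table (assumption: plain A-Z only).
    static final String LETTERS = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    // One Soundex digit per letter above; '0' marks letters that carry no code.
    static final String CODES = "01230120022455012623010202";

    // Returns the Soundex digit for an upper-case letter, or throws
    // IllegalArgumentException when the character has no table entry.
    static char map(char ch) {
        int pos = LETTERS.indexOf(ch);
        if (pos == -1) {
            throw new IllegalArgumentException("The character is not mapped: " + ch);
        }
        return CODES.charAt(pos);
    }

    public static void main(String[] args) {
        // A plain letter has a table entry.
        System.out.println(map('U'));
        try {
            // U with diaeresis lies outside A-Z, so there is no entry for it.
            map('\u00DC');
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

&lt;P&gt;Under this assumption the character is read correctly from the file and only fails at the lookup, because the table has no entry for it.&lt;/P&gt;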
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;I've already tried the solution from this post: &lt;A href="https://community.talend.com/t5/Design-and-Development/Handling-special-characters/m-p/25169#M4268" target="_blank"&gt;https://community.talend.com/t5/Design-and-Development/Handling-special-characters/m-p/25169#M4268&lt;/A&gt;&lt;/P&gt; 
&lt;P&gt;I also checked "Allow specific characters (UTF8,...) for columns of schemas" under Window / Preferences / Talend / Specific Settings.&lt;/P&gt; 
&lt;P&gt;Neither solution worked for me.&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;Is there any workaround to solve the problem?&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;Thanks in advance&lt;/P&gt; 
&lt;P&gt;Frank&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sat, 16 Nov 2024 08:18:32 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Data-Quality/Open-Studio-for-DQ-can-not-handle-special-characters-in-CSV-File/m-p/2196850#M146</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2024-11-16T08:18:32Z</dc:date>
    </item>
    <item>
      <title>Re: Open Studio for DQ can not handle special characters in CSV-File encoded as utf-8</title>
      <link>https://community.qlik.com/t5/Data-Quality/Open-Studio-for-DQ-can-not-handle-special-characters-in-CSV-File/m-p/2196851#M147</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;
&lt;P&gt;Have you tried adding -Dfile.encoding=utf8 to the .ini (config) file and restarting your Studio to see if it works?&lt;/P&gt;
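&lt;P&gt;One way to see what the option does is to print the default charset a JVM reports when started with it. This is only a sketch: DefaultCharsetCheck is a hypothetical helper, and it shows the setting of the JVM it runs in, not of the Studio itself.&lt;/P&gt;

```java
import java.nio.charset.Charset;

public class DefaultCharsetCheck {
    // Returns the name of the charset the JVM uses when none is given explicitly.
    static String defaultCharsetName() {
        return Charset.defaultCharset().name();
    }

    public static void main(String[] args) {
        // Print the raw system property and the charset the JVM resolved.
        System.out.println("file.encoding = " + System.getProperty("file.encoding"));
        System.out.println("default charset = " + defaultCharsetName());
    }
}
```

&lt;P&gt;It can be run, for example, as: java -Dfile.encoding=utf8 DefaultCharsetCheck&lt;/P&gt;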
&lt;P&gt;Best regards&lt;/P&gt;
&lt;P&gt;Sabrina&lt;/P&gt;</description>
      <pubDate>Mon, 14 May 2018 08:20:10 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Data-Quality/Open-Studio-for-DQ-can-not-handle-special-characters-in-CSV-File/m-p/2196851#M147</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2018-05-14T08:20:10Z</dc:date>
    </item>
    <item>
      <title>Re: Open Studio for DQ can not handle special characters in CSV-File encoded as utf-8</title>
      <link>https://community.qlik.com/t5/Data-Quality/Open-Studio-for-DQ-can-not-handle-special-characters-in-CSV-File/m-p/2196852#M148</link>
      <description>Hi,&lt;BR /&gt;We don't support running the 'Soundex Frequency' indicator on a column whose values contain special characters like "ü" and "é" or Chinese/Japanese characters.&lt;BR /&gt;This error is expected, and we will not fix it.</description>
      <pubDate>Wed, 15 Aug 2018 08:09:34 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Data-Quality/Open-Studio-for-DQ-can-not-handle-special-characters-in-CSV-File/m-p/2196852#M148</guid>
      <dc:creator>msjian</dc:creator>
      <dc:date>2018-08-15T08:09:34Z</dc:date>
    </item>
  </channel>
</rss>

