<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Tmatchgroup Limit? in Data Quality</title>
    <link>https://community.qlik.com/t5/Data-Quality/Tmatchgroup-Limit/m-p/2277903#M3301</link>
    <description>Hi, 
&lt;BR /&gt; 
&lt;FONT size="2"&gt;&lt;FONT face="Calibri, sans-serif"&gt;For a large set of data, could you please try to store the data on disk instead of memory on tMatchgroup?&lt;/FONT&gt;&lt;/FONT&gt; 
&lt;BR /&gt; 
&lt;FONT size="2"&gt;&lt;FONT face="Calibri, sans-serif"&gt;Here is a KB article about:&lt;A href="https://help.talend.com/pages/viewpage.action?pageId=190513241" target="_blank" rel="nofollow noopener noreferrer"&gt;TalendHelpCenter:Exception: outOfMemory&lt;/A&gt;&lt;/FONT&gt;&lt;/FONT&gt; 
&lt;BR /&gt;Best regards 
&lt;BR /&gt;Sabrina 
&lt;BR /&gt; 
&lt;BR /&gt; 
&lt;span class="lia-inline-image-display-wrapper" image-alt="0683p000009MDkT.png"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/155678iEC9D0C34441355A6/image-size/large?v=v2&amp;amp;px=999" role="button" title="0683p000009MDkT.png" alt="0683p000009MDkT.png" /&gt;&lt;/span&gt;</description>
    <pubDate>Thu, 14 Apr 2016 04:22:15 GMT</pubDate>
    <dc:creator>Anonymous</dc:creator>
    <dc:date>2016-04-14T04:22:15Z</dc:date>
    <item>
      <title>Tmatchgroup Limit?</title>
      <link>https://community.qlik.com/t5/Data-Quality/Tmatchgroup-Limit/m-p/2277902#M3300</link>
      <description>Hello,
&lt;BR /&gt;I am trying to deduplicate 500 000 lines with the tMatchGroup component, and each time I get an Exception in thread "main" java.lang.OutOfMemoryError. What is the limit for tMatchGroup?
&lt;BR /&gt;Thanks</description>
      <pubDate>Wed, 13 Apr 2016 13:30:15 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Data-Quality/Tmatchgroup-Limit/m-p/2277902#M3300</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2016-04-13T13:30:15Z</dc:date>
    </item>
    <item>
      <title>Re: Tmatchgroup Limit?</title>
      <link>https://community.qlik.com/t5/Data-Quality/Tmatchgroup-Limit/m-p/2277903#M3301</link>
      <description>Hi, 
&lt;BR /&gt; 
&lt;FONT size="2"&gt;&lt;FONT face="Calibri, sans-serif"&gt;For a large set of data, could you please try to store the data on disk instead of memory on tMatchgroup?&lt;/FONT&gt;&lt;/FONT&gt; 
&lt;BR /&gt; 
&lt;FONT size="2"&gt;&lt;FONT face="Calibri, sans-serif"&gt;Here is a KB article about:&lt;A href="https://help.talend.com/pages/viewpage.action?pageId=190513241" target="_blank" rel="nofollow noopener noreferrer"&gt;TalendHelpCenter:Exception: outOfMemory&lt;/A&gt;&lt;/FONT&gt;&lt;/FONT&gt; 
&lt;BR /&gt;Best regards 
&lt;BR /&gt;Sabrina 
&lt;BR /&gt; 
&lt;BR /&gt; 
&lt;span class="lia-inline-image-display-wrapper" image-alt="0683p000009MDkT.png"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/155678iEC9D0C34441355A6/image-size/large?v=v2&amp;amp;px=999" role="button" title="0683p000009MDkT.png" alt="0683p000009MDkT.png" /&gt;&lt;/span&gt;</description>
      <pubDate>Thu, 14 Apr 2016 04:22:15 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Data-Quality/Tmatchgroup-Limit/m-p/2277903#M3301</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2016-04-14T04:22:15Z</dc:date>
    </item>
    <item>
      <title>Re: Tmatchgroup Limit?</title>
      <link>https://community.qlik.com/t5/Data-Quality/Tmatchgroup-Limit/m-p/2277904#M3302</link>
      <description>Hello,&lt;BR /&gt;I tried this yesterday. I no longer get an error, but the job "freezes" without any error message: after the first set of rows is processed, nothing happens. You can see this in the screenshot I uploaded.&lt;BR /&gt;Best regards</description>
      <pubDate>Thu, 14 Apr 2016 06:48:35 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Data-Quality/Tmatchgroup-Limit/m-p/2277904#M3302</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2016-04-14T06:48:35Z</dc:date>
    </item>
    <item>
      <title>Re: Tmatchgroup Limit?</title>
      <link>https://community.qlik.com/t5/Data-Quality/Tmatchgroup-Limit/m-p/2277905#M3303</link>
      <description>Hi, 
&lt;BR /&gt;Are you using a blocking key in the configuration of the component? 
&lt;BR /&gt;If you aren't, you're trying to do on the order of 500 000 x 500 000 comparisons. This won't fit in memory and, even with the store-on-disk option, it will take days to complete... 
&lt;BR /&gt;You must use a blocking key (probably by generating it with the tGenKey component). Have a look at examples at 
&lt;A href="https://help.talend.com/search/all?query=tMatchGroup&amp;amp;content-lang=en" rel="nofollow noopener noreferrer"&gt;https://help.talend.com/search/all?query=tMatchGroup&amp;amp;content-lang=en&lt;/A&gt; 
&lt;BR /&gt;The blocking key will partition the data so that the number of comparisons is greatly decreased. 
&lt;BR /&gt;See also this documentation 
&lt;A href="https://help.talend.com/search/all?query=tGenKey&amp;amp;content-lang=en" rel="nofollow noopener noreferrer"&gt;https://help.talend.com/search/all?query=tGenKey&amp;amp;content-lang=en&lt;/A&gt; about how to tune your tGenKey configuration for a good performance. It's advised to build blocks (aka partitions) of a few tens or hundreds of line. Use the blocking key profile to tune your partitions. 
&lt;BR /&gt;Hope this helps.</description>
      <pubDate>Thu, 14 Apr 2016 10:51:36 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Data-Quality/Tmatchgroup-Limit/m-p/2277905#M3303</guid>
      <dc:creator>Sebastiao_Qlik</dc:creator>
      <dc:date>2016-04-14T10:51:36Z</dc:date>
    </item>
  </channel>
</rss>

