<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: [resolved] Remove all duplicate rows from flow (including original)? in Talend Studio</title>
    <link>https://community.qlik.com/t5/Talend-Studio/resolved-Remove-all-duplicate-rows-from-flow-including-original/m-p/2315887#M86491</link>
    <description>Hello 
&lt;BR /&gt; 
&lt;BLOCKQUOTE&gt; 
 &lt;TABLE border="1"&gt; 
  &lt;TBODY&gt; 
   &lt;TR&gt; 
    &lt;TD&gt;(tUniqRow only separates out the second and subsequent records with the duplicated key, letting the original record with that key through.)&lt;/TD&gt; 
   &lt;/TR&gt; 
  &lt;/TBODY&gt; 
 &lt;/TABLE&gt; 
&lt;/BLOCKQUOTE&gt; 
&lt;BR /&gt;Yes, you need to use tUniqRow to split the flow into unique records and duplicate records, and first write them to two temp files or to memory (tHashOutput). In the next subJob, read the two sets back with two tFileInputDelimited components (or with tHashInput components if they are in memory), do an inner join on the key, and keep the unmatched rows. 
&lt;BR /&gt;Best regards 
&lt;BR /&gt; 
&lt;BR /&gt; shong</description>
    <pubDate>Mon, 25 Jan 2010 03:38:07 GMT</pubDate>
    <dc:creator>Anonymous</dc:creator>
    <dc:date>2010-01-25T03:38:07Z</dc:date>
    <item>
      <title>[resolved] Remove all duplicate rows from flow (including original)?</title>
      <link>https://community.qlik.com/t5/Talend-Studio/resolved-Remove-all-duplicate-rows-from-flow-including-original/m-p/2315886#M86490</link>
      <description>I have a series of jobs which perform complex mappings that might result in duplicate keys.  I would like to separate out ALL records with a duplicated key to be manually resolved.  (tUniqRow only separates out the second and subsequent records with the duplicated key, letting the original record with that key through.)&lt;BR /&gt;Is there another way to do this other than:&lt;BR /&gt;-- write the mapped flow to a file&lt;BR /&gt;-- read it back twice, once in full and once counting the records grouped on the key&lt;BR /&gt;-- join them using the key and then filter them on the count?&lt;BR /&gt;Ideally, the tSchemaComplianceCheck would do this based on the key fields specified in the schema.&lt;BR /&gt;I'm using Integration Suite 3.2.3 with a Java project.&lt;BR /&gt;Thanks!</description>
      <pubDate>Sat, 16 Nov 2024 13:35:50 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/resolved-Remove-all-duplicate-rows-from-flow-including-original/m-p/2315886#M86490</guid>
      <dc:creator>alevy</dc:creator>
      <dc:date>2024-11-16T13:35:50Z</dc:date>
    </item>
    <item>
      <title>Re: [resolved] Remove all duplicate rows from flow (including original)?</title>
      <link>https://community.qlik.com/t5/Talend-Studio/resolved-Remove-all-duplicate-rows-from-flow-including-original/m-p/2315887#M86491</link>
      <description>Hello 
&lt;BR /&gt; 
&lt;BLOCKQUOTE&gt; 
 &lt;TABLE border="1"&gt; 
  &lt;TBODY&gt; 
   &lt;TR&gt; 
    &lt;TD&gt;(tUniqRow only separates out the second and subsequent records with the duplicated key, letting the original record with that key through.)&lt;/TD&gt; 
   &lt;/TR&gt; 
  &lt;/TBODY&gt; 
 &lt;/TABLE&gt; 
&lt;/BLOCKQUOTE&gt; 
&lt;BR /&gt;Yes, you need to use tUniqRow to split the flow into unique records and duplicate records, and first write them to two temp files or to memory (tHashOutput). In the next subJob, read the two sets back with two tFileInputDelimited components (or with tHashInput components if they are in memory), do an inner join on the key, and keep the unmatched rows. 
&lt;BR /&gt;Best regards 
&lt;BR /&gt; 
&lt;BR /&gt; shong</description>
      <pubDate>Mon, 25 Jan 2010 03:38:07 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/resolved-Remove-all-duplicate-rows-from-flow-including-original/m-p/2315887#M86491</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2010-01-25T03:38:07Z</dc:date>
    </item>
    <item>
      <title>Re: [resolved] Remove all duplicate rows from flow (including original)?</title>
      <link>https://community.qlik.com/t5/Talend-Studio/resolved-Remove-all-duplicate-rows-from-flow-including-original/m-p/2315888#M86492</link>
      <description>Hi shong, your suggestion doesn't seem to achieve exactly what I want. It gives me, at the end, only the truly uniquely-keyed records, but I want all the duplicately-keyed records as well, i.e. two sets of data whose combined record count is the same as in my original table. 
&lt;BR /&gt;Thanks</description>
      <pubDate>Mon, 25 Jan 2010 04:27:50 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/resolved-Remove-all-duplicate-rows-from-flow-including-original/m-p/2315888#M86492</guid>
      <dc:creator>alevy</dc:creator>
      <dc:date>2010-01-25T04:27:50Z</dc:date>
    </item>
    <item>
      <title>Re: [resolved] Remove all duplicate rows from flow (including original)?</title>
      <link>https://community.qlik.com/t5/Talend-Studio/resolved-Remove-all-duplicate-rows-from-flow-including-original/m-p/2315889#M86493</link>
      <description>Hello 
&lt;BR /&gt;
&lt;BLOCKQUOTE&gt;
 &lt;TABLE border="1"&gt;
  &lt;TBODY&gt;
   &lt;TR&gt;
    &lt;TD&gt;but I want all the duplicately-keyed records as well&lt;/TD&gt;
   &lt;/TR&gt;
  &lt;/TBODY&gt;
 &lt;/TABLE&gt;
&lt;/BLOCKQUOTE&gt;
&lt;BR /&gt;In the second subJob, also capture the matched rows. In the third subJob, merge the matched rows with the duplicate rows (stored in tHashOutput_2).
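&lt;BR /&gt;For readers who prefer code, the three subJobs can be sketched outside Talend in plain Python. This is only an illustration under assumptions: the function name split_by_duplicate_key and the sample rows are hypothetical, and the first loop merely mimics how tUniqRow passes the first row per key and rejects the later ones. It does not call any Talend API.
&lt;PRE&gt;
```python
def split_by_duplicate_key(rows, key):
    # Step 1 (tUniqRow-like split): the first row seen for a key goes to
    # "uniques", every later row with the same key goes to "duplicates".
    seen = set()
    uniques, duplicates = [], []
    for row in rows:
        if row[key] in seen:
            duplicates.append(row)
        else:
            seen.add(row[key])
            uniques.append(row)

    # Step 2 (inner join of uniques against duplicates on the key):
    # matched rows are the originals of a duplicated key,
    # unmatched rows (the join rejects) are truly unique.
    duplicated_keys = {row[key] for row in duplicates}
    matched = [row for row in uniques if row[key] in duplicated_keys]
    truly_unique = [row for row in uniques if row[key] not in duplicated_keys]

    # Step 3 (merge): originals plus the later duplicates.
    all_duplicates = matched + duplicates
    return truly_unique, all_duplicates


# Hypothetical sample data: ids 1 and 2 are duplicated, id 3 is not.
rows = [
    {"id": 1, "name": "a"},
    {"id": 2, "name": "b"},
    {"id": 1, "name": "c"},
    {"id": 3, "name": "d"},
    {"id": 2, "name": "e"},
]
truly_unique, all_duplicates = split_by_duplicate_key(rows, "id")
```
&lt;/PRE&gt;
&lt;BR /&gt;With these sample rows, truly_unique holds only the id 3 row and all_duplicates holds the four rows with ids 1 and 2, so the two sets together contain as many records as the input. Note that the merged set lists the originals first and the later duplicates after them, not in input order.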
&lt;BR /&gt;Best regards
&lt;BR /&gt; shong</description>
      <pubDate>Mon, 25 Jan 2010 05:22:17 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/resolved-Remove-all-duplicate-rows-from-flow-including-original/m-p/2315889#M86493</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2010-01-25T05:22:17Z</dc:date>
    </item>
    <item>
      <title>Re: [resolved] Remove all duplicate rows from flow (including original)?</title>
      <link>https://community.qlik.com/t5/Talend-Studio/resolved-Remove-all-duplicate-rows-from-flow-including-original/m-p/2315890#M86494</link>
      <description>An efficient and clean way is to use tAggregateRow to count the rows per key, join the counts back to the input with tMap, and then filter out all rows whose key count is more than 1. See the attached screenshots. 
&lt;BR /&gt; 
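&lt;BR /&gt;The count-and-filter idea above can also be sketched in plain Python, as a rough illustration only. The function name split_by_key_count and the sample rows are hypothetical; the Counter stands in for the tAggregateRow count and the two list comprehensions stand in for the tMap lookup and filter. Nothing here is a Talend API.
&lt;PRE&gt;
```python
from collections import Counter


def split_by_key_count(rows, key):
    # tAggregateRow-like step: count how many rows share each key value.
    counts = Counter(row[key] for row in rows)

    # tMap-like step: look up the count for each input row and route the
    # row to one of two outputs depending on whether its key is unique.
    unique_rows = [row for row in rows if counts[row[key]] == 1]
    duplicate_rows = [row for row in rows if counts[row[key]] != 1]
    return unique_rows, duplicate_rows


# Hypothetical sample data: ids 1 and 2 are duplicated, id 3 is not.
rows = [
    {"id": 1, "name": "a"},
    {"id": 2, "name": "b"},
    {"id": 1, "name": "c"},
    {"id": 3, "name": "d"},
    {"id": 2, "name": "e"},
]
unique_rows, duplicate_rows = split_by_key_count(rows, "id")
```
&lt;/PRE&gt;
&lt;BR /&gt;With these sample rows, unique_rows holds only the id 3 row and duplicate_rows holds the other four in their original input order. The input is read twice (once to count, once to route), which matches joining the aggregate back to the original flow.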
&lt;span class="lia-inline-image-display-wrapper" image-alt="0683p000009MEf1.jpg"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/153381i1F1F7AD0EBD80DB8/image-size/large?v=v2&amp;amp;px=999" role="button" title="0683p000009MEf1.jpg" alt="0683p000009MEf1.jpg" /&gt;&lt;/span&gt; 
&lt;span class="lia-inline-image-display-wrapper" image-alt="0683p000009MElC.jpg"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/133258iE60FAE8C2203BF30/image-size/large?v=v2&amp;amp;px=999" role="button" title="0683p000009MElC.jpg" alt="0683p000009MElC.jpg" /&gt;&lt;/span&gt; 
&lt;span class="lia-inline-image-display-wrapper" image-alt="0683p000009MD93.jpg"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/148700i0EB01FF5B40F09A4/image-size/large?v=v2&amp;amp;px=999" role="button" title="0683p000009MD93.jpg" alt="0683p000009MD93.jpg" /&gt;&lt;/span&gt; 
&lt;span class="lia-inline-image-display-wrapper" image-alt="0683p000009MEXM.jpg"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/127953i3A1E28C4900A44D7/image-size/large?v=v2&amp;amp;px=999" role="button" title="0683p000009MEXM.jpg" alt="0683p000009MEXM.jpg" /&gt;&lt;/span&gt; 
&lt;span class="lia-inline-image-display-wrapper" image-alt="0683p000009MEis.jpg"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/135511iC57CA37BBC053D42/image-size/large?v=v2&amp;amp;px=999" role="button" title="0683p000009MEis.jpg" alt="0683p000009MEis.jpg" /&gt;&lt;/span&gt;</description>
      <pubDate>Tue, 20 Dec 2016 03:54:29 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/resolved-Remove-all-duplicate-rows-from-flow-including-original/m-p/2315890#M86494</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2016-12-20T03:54:29Z</dc:date>
    </item>
  </channel>
</rss>

