<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Validate file for duplicate records in Talend Studio</title>
    <link>https://community.qlik.com/t5/Talend-Studio/Validate-file-for-duplicate-records/m-p/2253946#M37108</link>
    <description>&lt;P&gt;So the scenario is: if there are no duplicates, process the file; if tUniqRow gives you duplicates, report it.&lt;/P&gt;</description>
    <pubDate>Fri, 05 May 2017 15:26:55 GMT</pubDate>
    <dc:creator>Anonymous</dc:creator>
    <dc:date>2017-05-05T15:26:55Z</dc:date>
    <item>
      <title>Validate file for duplicate records</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Validate-file-for-duplicate-records/m-p/2253939#M37101</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;I have a requirement to validate a file: if the file contains duplicate records, discard it; if it contains no duplicates, process it. Even a single duplicate record means the whole file must be discarded.&lt;/P&gt;&lt;P&gt;Please help.&lt;/P&gt;&lt;P&gt;Thanks,&lt;/P&gt;&lt;P&gt;Pravin Sanadi&lt;/P&gt;</description>
      <pubDate>Fri, 05 May 2017 11:19:47 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Validate-file-for-duplicate-records/m-p/2253939#M37101</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2017-05-05T11:19:47Z</dc:date>
    </item>
    <item>
      <title>Re: Validate file for duplicate records</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Validate-file-for-duplicate-records/m-p/2253940#M37102</link>
      <description>&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;I think a combination of tUniqRow and tJavaRow should do the trick.&lt;/P&gt;
&lt;P&gt;Connect the duplicates output of tUniqRow to tJavaRow.&lt;/P&gt;
&lt;P&gt;You will also need a context variable that acts as a flag.&lt;/P&gt;
&lt;P&gt;Finally, in tJavaRow, set the context variable to true when the input has duplicates.&lt;/P&gt;
&lt;P&gt;E.g.:&lt;/P&gt;
&lt;P&gt;&lt;SPAN class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="setDupFlag.PNG" style="width: 244px;"&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="0683p000009Ltwg.png"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/139929i70B494EADA8F4418/image-size/large?v=v2&amp;amp;px=999" role="button" title="0683p000009Ltwg.png" alt="0683p000009Ltwg.png" /&gt;&lt;/span&gt;&lt;/SPAN&gt;&lt;/P&gt; 
&lt;P&gt;and in tJavaRow,&lt;/P&gt; 
&lt;P&gt;&lt;SPAN class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="tJavaRowDup.png" style="width: 298px;"&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="0683p000009Lu00.png"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/135815i922D3A7142B8F052/image-size/large?v=v2&amp;amp;px=999" role="button" title="0683p000009Lu00.png" alt="0683p000009Lu00.png" /&gt;&lt;/span&gt;&lt;/SPAN&gt;&lt;/P&gt; 
&lt;P&gt;Note that in the screenshot above, the context variable &lt;STRONG&gt;duplicateExists&lt;/STRONG&gt; is of boolean type.&lt;/P&gt;
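&lt;P&gt;Since the tJavaRow code is only visible in the screenshot, here is a minimal plain-Java sketch of the idea, not the original code. The Context class and the method names are hypothetical stand-ins for what Talend generates; only the name duplicateExists comes from the post.&lt;/P&gt;

```java
// Sketch only: simulates a tJavaRow fed by the duplicates output of tUniqRow.
class DupFlagSketch {
    // Hypothetical stand-in for the job context; the flag defaults to false.
    static class Context {
        boolean duplicateExists = false;
    }

    // Plays the role of the tJavaRow body: it runs once per duplicate row,
    // so reaching it at all means the file contains a duplicate.
    static void onDuplicateRow(Context context, String row) {
        context.duplicateExists = true;
    }

    // Pushes the rows of the duplicates flow through the handler and returns the flag.
    static boolean flagAfterFlow(String[] duplicateRows) {
        Context context = new Context();
        for (String row : duplicateRows) {
            onDuplicateRow(context, row);
        }
        return context.duplicateExists;
    }

    public static void main(String[] args) {
        // No rows on the duplicates flow: the flag keeps its default.
        System.out.println(flagAfterFlow(new String[] {}));
        // One row on the duplicates flow is enough to raise the flag.
        System.out.println(flagAfterFlow(new String[] {"A100;bolt"}));
    }
}
```

&lt;P&gt;The flag has to default to false, because tJavaRow never executes when tUniqRow sends nothing down the duplicates flow.&lt;/P&gt;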
&lt;P&gt;Once this subjob completes, you can use the context variable in an if condition to decide whether or not to process the file.&lt;/P&gt;</description>
      <pubDate>Fri, 05 May 2017 13:25:37 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Validate-file-for-duplicate-records/m-p/2253940#M37102</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2017-05-05T13:25:37Z</dc:date>
    </item>
    <item>
      <title>Re: Validate file for duplicate records</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Validate-file-for-duplicate-records/m-p/2253941#M37103</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt; 
&lt;P&gt;You can use tFileList: point it at the directory that holds the files, then pass each file through tFileInputDelimited and then tUniqRow. Put a Run if condition on tFileInputDelimited, since tUniqRow gives you only the unique rows, and then use a tJava with a condition on the number of rows in the table. Any duplicate will be missing from the tUniqRow output because the duplicated records have been rejected, so a table with fewer rows than the input contains duplicates: reject it, and the rest is your output.&lt;/P&gt;</description>
      <pubDate>Fri, 05 May 2017 13:49:05 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Validate-file-for-duplicate-records/m-p/2253941#M37103</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2017-05-05T13:49:05Z</dc:date>
    </item>
    <item>
      <title>Re: Validate file for duplicate records</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Validate-file-for-duplicate-records/m-p/2253942#M37104</link>
      <description>&lt;P&gt;Thanks. So to process the valid file, do I need to create another job?&lt;/P&gt;</description>
      <pubDate>Fri, 05 May 2017 13:56:11 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Validate-file-for-duplicate-records/m-p/2253942#M37104</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2017-05-05T13:56:11Z</dc:date>
    </item>
    <item>
      <title>Re: Validate file for duplicate records</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Validate-file-for-duplicate-records/m-p/2253943#M37105</link>
      <description>&lt;P&gt;You can do that in a single job; you just need to put an if condition on tFileInputDelimited.&lt;/P&gt;</description>
      <pubDate>Fri, 05 May 2017 14:06:21 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Validate-file-for-duplicate-records/m-p/2253943#M37105</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2017-05-05T14:06:21Z</dc:date>
    </item>
    <item>
      <title>Re: Validate file for duplicate records</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Validate-file-for-duplicate-records/m-p/2253944#M37106</link>
      <description>&lt;P&gt;&lt;SPAN class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="job.JPG" style="width: 761px;"&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="0683p000009Lrgu.jpg"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/146000i0EF1710A09E46284/image-size/large?v=v2&amp;amp;px=999" role="button" title="0683p000009Lrgu.jpg" alt="0683p000009Lrgu.jpg" /&gt;&lt;/span&gt;&lt;/SPAN&gt;&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;In tJavaRow I have written this code:&lt;/P&gt;
&lt;PRE&gt;context.ISVALID = true;
if (input_row.MATERIAL != null) {
    context.ISVALID = false;
}
output_row.ISVALID = context.ISVALID;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;I need to process the file if it is valid and discard it if not. How do I do that?&lt;/P&gt;</description>
      <pubDate>Fri, 05 May 2017 14:16:19 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Validate-file-for-duplicate-records/m-p/2253944#M37106</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2017-05-05T14:16:19Z</dc:date>
    </item>
    <item>
      <title>Re: Validate file for duplicate records</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Validate-file-for-duplicate-records/m-p/2253945#M37107</link>
      <description>&lt;P&gt;Which keys did you set in tUniqRow? No records are passing through tUniqRow!&lt;/P&gt;</description>
      <pubDate>Fri, 05 May 2017 14:35:47 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Validate-file-for-duplicate-records/m-p/2253945#M37107</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2017-05-05T14:35:47Z</dc:date>
    </item>
    <item>
      <title>Re: Validate file for duplicate records</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Validate-file-for-duplicate-records/m-p/2253946#M37108</link>
      <description>&lt;P&gt;So the scenario is: if there are no duplicates, process the file; if tUniqRow gives you duplicates, report it.&lt;/P&gt;</description>
      <pubDate>Fri, 05 May 2017 15:26:55 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Validate-file-for-duplicate-records/m-p/2253946#M37108</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2017-05-05T15:26:55Z</dc:date>
    </item>
    <item>
      <title>Re: Validate file for duplicate records</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Validate-file-for-duplicate-records/m-p/2253947#M37109</link>
      <description>&lt;P&gt;From tJavaRow, you can use a Run if link to connect to another subjob that processes the file.&lt;/P&gt;
&lt;P&gt;&lt;SPAN class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="ignoreDups.PNG" style="width: 400px;"&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="0683p000009Ltx0.png"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/148708i604F74ED5F4CBFC4/image-size/large?v=v2&amp;amp;px=999" role="button" title="0683p000009Ltx0.png" alt="0683p000009Ltx0.png" /&gt;&lt;/span&gt;&lt;/SPAN&gt;&lt;/P&gt; 
&lt;P&gt;Here, in If (order 1) you can use just &lt;STRONG&gt;context.ISVALID&lt;/STRONG&gt; as the condition, which means: process the file, since it contains no duplicates. In If (order 2) you can use &lt;STRONG&gt;!context.ISVALID&lt;/STRONG&gt;, which means the file has duplicates.&lt;/P&gt;
&lt;P&gt;It's a simple solution, but the drawback is that the file is read twice: once to check for duplicates and once to process it. If your input file isn't too large, this should work just fine. Otherwise the job run time will increase, since a huge file is read twice.&lt;/P&gt;
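&lt;P&gt;The control flow can be sketched in plain Java as follows. This is only an illustration under assumptions: the class and method names are made up, the whole row is used as the uniqueness key, and the local isValid variable stands in for context.ISVALID.&lt;/P&gt;

```java
import java.util.Arrays;

// Sketch only: first pass validates the rows, second pass processes them.
class TwoPassSketch {
    // Pass 1: true when every row is distinct (the whole row is the key here).
    static boolean isValid(String[] rows) {
        return Arrays.stream(rows).distinct().count() == rows.length;
    }

    // Mirrors the two Run if links that leave the validation subjob.
    static String handle(String[] rows) {
        boolean isValid = isValid(rows); // stands in for context.ISVALID
        if (isValid) {
            // If (order 1): context.ISVALID, so the rows are read again and processed.
            return "processed " + rows.length + " rows";
        }
        // If (order 2): !context.ISVALID, so the whole file is discarded.
        return "discarded";
    }

    public static void main(String[] args) {
        System.out.println(handle(new String[] {"A100", "A200"}));
        System.out.println(handle(new String[] {"A100", "A100"}));
    }
}
```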
&lt;P&gt;If I can think of something better I will post it here.&lt;/P&gt;</description>
      <pubDate>Fri, 05 May 2017 16:29:41 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Validate-file-for-duplicate-records/m-p/2253947#M37109</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2017-05-05T16:29:41Z</dc:date>
    </item>
    <item>
      <title>Re: Validate file for duplicate records</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Validate-file-for-duplicate-records/m-p/2253948#M37110</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt; 
&lt;P&gt;I think the simplest job is:&lt;/P&gt;
&lt;P&gt;&lt;SPAN class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Capture.PNG" style="width: 400px;"&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="0683p000009Lu5q.png"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/140432i1DE92971E97802D6/image-size/large?v=v2&amp;amp;px=999" role="button" title="0683p000009Lu5q.png" alt="0683p000009Lu5q.png" /&gt;&lt;/span&gt;&lt;/SPAN&gt;&lt;/P&gt; 
&lt;P&gt;The "if" after tUniqRow is based on the value of the global variable tUniqRow_1_NB_DUPLICATES, automagically associated with the tUniqRow_1 component (thanks to TDI):&lt;/P&gt;
&lt;PRE&gt;((Integer)globalMap.get("tUniqRow_1_NB_DUPLICATES")) == 0&lt;/PRE&gt; 
&lt;P&gt;No tJavaRow required for this use case.&lt;/P&gt; 
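&lt;P&gt;One caveat with the cast in that expression: if the counter were ever missing from globalMap, unboxing the null would throw a NullPointerException. A null-safe comparison is one way around that. The sketch below is an assumption-laden illustration, not part of the job above; the class and method names are hypothetical, and the value passed in would be globalMap.get("tUniqRow_1_NB_DUPLICATES").&lt;/P&gt;

```java
// Sketch only: null-safe version of the "no duplicates" check.
class RunIfSketch {
    // Key under which the duplicate counter is stored, as quoted in the post.
    static final String KEY = "tUniqRow_1_NB_DUPLICATES";

    // True only when the counter is present and equal to zero.
    // A missing (null) counter is treated as "not validated", never as "clean".
    static boolean isDuplicateFree(Object nbDuplicates) {
        return Integer.valueOf(0).equals(nbDuplicates);
    }

    public static void main(String[] args) {
        System.out.println(KEY + " = 0: " + isDuplicateFree(0));
        System.out.println(KEY + " = 3: " + isDuplicateFree(3));
        System.out.println(KEY + " missing: " + isDuplicateFree(null));
    }
}
```

&lt;P&gt;The same comparison can be written inline as Integer.valueOf(0).equals(globalMap.get("tUniqRow_1_NB_DUPLICATES")).&lt;/P&gt;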
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;Hope this helps,&lt;/P&gt;</description>
      <pubDate>Sat, 06 May 2017 22:24:02 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Validate-file-for-duplicate-records/m-p/2253948#M37110</guid>
      <dc:creator>TRF</dc:creator>
      <dc:date>2017-05-06T22:24:02Z</dc:date>
    </item>
    <item>
      <title>Re: Validate file for duplicate records</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Validate-file-for-duplicate-records/m-p/2253949#M37111</link>
      <description>&lt;P&gt;Yup. That's definitely a better and simpler solution.&lt;/P&gt;</description>
      <pubDate>Sun, 07 May 2017 09:27:29 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Validate-file-for-duplicate-records/m-p/2253949#M37111</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2017-05-07T09:27:29Z</dc:date>
    </item>
  </channel>
</rss>

