<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Duplicate check between files while using tFilelist in Talend Studio</title>
    <link>https://community.qlik.com/t5/Talend-Studio/Duplicate-check-between-files-while-using-tFilelist/m-p/2202720#M4314</link>
    <description>&lt;P&gt;I suppose all your input files are based on the same schema. In that case, you can read all the input files, push the result to a single temporary file, then eliminate the duplicate records before loading into MySQL.&lt;/P&gt;&lt;P&gt;The design should look like this:&lt;/P&gt;&lt;P&gt;tFileList--(iterate)--&amp;gt;tFileInputDelimited--&amp;gt;tFileOutputDelimited(with Append option ticked)&lt;/P&gt;&lt;P&gt;|&lt;/P&gt;&lt;P&gt;+(OnSubjobOK)&lt;/P&gt;&lt;P&gt;|&lt;/P&gt;&lt;P&gt;tFileInputDelimited--&amp;gt;tUniqRow--&amp;gt;tMysqlOutput&lt;/P&gt;</description>
    <pubDate>Tue, 15 Oct 2019 15:11:06 GMT</pubDate>
    <dc:creator>TRF</dc:creator>
    <dc:date>2019-10-15T15:11:06Z</dc:date>
    <item>
      <title>Duplicate check between files while using tFilelist</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Duplicate-check-between-files-while-using-tFilelist/m-p/2202719#M4313</link>
      <description>&lt;P&gt;Hi Team,&lt;/P&gt; 
&lt;P&gt;I am trying to load files from a directory into a MySQL output table.&lt;BR /&gt;I used a tFileList &amp;gt; tFileinputDelimited &amp;gt; tMap &amp;gt; tMySqlOutput design to iterate through the files.&lt;BR /&gt;Now I want to remove duplicate data between files, i.e., check the data based on one column or a combination of 2-3 columns across the files.&lt;BR /&gt;For example: if the month column of the first file contains &lt;STRONG&gt;NOV&lt;/STRONG&gt; and the second file contains the same month, &lt;STRONG&gt;NOV&lt;/STRONG&gt;, the job should skip loading the second file.&lt;BR /&gt;Please help me implement this in my job.&lt;/P&gt;</description>
      <pubDate>Sat, 16 Nov 2024 04:22:35 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Duplicate-check-between-files-while-using-tFilelist/m-p/2202719#M4313</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2024-11-16T04:22:35Z</dc:date>
    </item>
    <item>
      <title>Re: Duplicate check between files while using tFilelist</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Duplicate-check-between-files-while-using-tFilelist/m-p/2202720#M4314</link>
      <description>&lt;P&gt;I suppose all your input files are based on the same schema. In that case, you can read all the input files, push the result to a single temporary file, then eliminate the duplicate records before loading into MySQL.&lt;/P&gt;&lt;P&gt;The design should look like this:&lt;/P&gt;&lt;P&gt;tFileList--(iterate)--&amp;gt;tFileInputDelimited--&amp;gt;tFileOutputDelimited(with Append option ticked)&lt;/P&gt;&lt;P&gt;|&lt;/P&gt;&lt;P&gt;+(OnSubjobOK)&lt;/P&gt;&lt;P&gt;|&lt;/P&gt;&lt;P&gt;tFileInputDelimited--&amp;gt;tUniqRow--&amp;gt;tMysqlOutput&lt;/P&gt;</description>
      <pubDate>Tue, 15 Oct 2019 15:11:06 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Duplicate-check-between-files-while-using-tFilelist/m-p/2202720#M4314</guid>
      <dc:creator>TRF</dc:creator>
      <dc:date>2019-10-15T15:11:06Z</dc:date>
    </item>
    <item>
      <title>Re: Duplicate check between files while using tFilelist</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Duplicate-check-between-files-while-using-tFilelist/m-p/2202721#M4315</link>
      <description>&lt;P&gt;Thanks, TRF, for providing the job design and concept. Can you please tell me how to identify and remove the duplicates from the temporary file, and how to distinguish whether a record came from the first file or the second file so I can determine which data is correct?&lt;/P&gt;</description>
      <pubDate>Tue, 15 Oct 2019 15:24:38 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Duplicate-check-between-files-while-using-tFilelist/m-p/2202721#M4315</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2019-10-15T15:24:38Z</dc:date>
    </item>
    <item>
      <title>Re: Duplicate check between files while using tFilelist</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Duplicate-check-between-files-while-using-tFilelist/m-p/2202722#M4316</link>
      <description>&lt;P&gt;You need a new field in the temporary file.&lt;/P&gt; 
&lt;P&gt;Change the design like this:&lt;/P&gt; 
&lt;P&gt;tFileList--(iterate)--&amp;gt;tFileInputDelimited--&amp;gt;tMap--&amp;gt;tFileOutputDelimited(with Append option ticked)&lt;/P&gt; 
&lt;P&gt;|&lt;/P&gt; 
&lt;P&gt;+(OnSubjobOK)&lt;/P&gt; 
&lt;P&gt;|&lt;/P&gt; 
&lt;P&gt;tFileInputDelimited--&amp;gt;tUniqRow--&amp;gt;tMysqlOutput&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;In the tMap, add a field to the output flow (let's say filename) and use this expression to populate it:&lt;/P&gt; 
&lt;PRE&gt;((String)globalMap.get("tFileList_1_CURRENT_FILEPATH"))&lt;/PRE&gt; 
&lt;P&gt;Change "tFileList_1" to match your component's actual name.&lt;/P&gt; 
&lt;P&gt;Is that what you expect?&lt;/P&gt;</description>
      <pubDate>Tue, 15 Oct 2019 15:52:32 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Duplicate-check-between-files-while-using-tFilelist/m-p/2202722#M4316</guid>
      <dc:creator>TRF</dc:creator>
      <dc:date>2019-10-15T15:52:32Z</dc:date>
    </item>
    <item>
      <title>Re: Duplicate check between files while using tFilelist</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Duplicate-check-between-files-while-using-tFilelist/m-p/2202723#M4317</link>
      <description>&lt;P&gt;Thanks TRF, I have tried this approach and it works, following the order in which the files are placed in the directory. tFileList processes the files from the last file in the directory to the first, right? Can we specify the order in which files are loaded, either in tFileList or with another component? Also, how should I specify the file names in tfileinputDelimited and tFileOutputDelimited in the main job, and in tFileinputDelimited in the subjob? By using ((String)globalMap.get("tFileList_1_CURRENT_FILEPATH"))?&lt;/P&gt;</description>
      <pubDate>Wed, 16 Oct 2019 15:20:40 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Duplicate-check-between-files-while-using-tFilelist/m-p/2202723#M4317</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2019-10-16T15:20:40Z</dc:date>
    </item>
  </channel>
</rss>

