<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Comparing more than one column from two different files in tmap in Talend Studio</title>
    <link>https://community.qlik.com/t5/Talend-Studio/Comparing-more-than-one-column-from-two-different-files-in-tmap/m-p/2231671#M21895</link>
    <description>&lt;P&gt;Let's talk about "CDC".&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;What does the overall setup look like?&lt;/LI&gt;&lt;LI&gt;Why 2 files?&lt;/LI&gt;&lt;LI&gt;What is the real source of the information?&lt;/LI&gt;&lt;LI&gt;How large are the files?&lt;/LI&gt;&lt;LI&gt;Any CDC needs a primary key. Do the files have one?&lt;/LI&gt;&lt;/UL&gt;</description>
    <pubDate>Thu, 31 Jan 2019 05:31:26 GMT</pubDate>
    <dc:creator>vapukov</dc:creator>
    <dc:date>2019-01-31T05:31:26Z</dc:date>
    <item>
      <title>Comparing more than one column from two different files in tmap</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Comparing-more-than-one-column-from-two-different-files-in-tmap/m-p/2231667#M21891</link>
      <description>&lt;P&gt;Input: two text files with 5 columns each.&lt;/P&gt;
&lt;P&gt;Schema: both files have the same schema.&lt;/P&gt;
&lt;P&gt;Output: compare all 5 columns for each row and set a flag if any column does not match.&lt;/P&gt;
&lt;P&gt;Approach taken:&lt;/P&gt;
&lt;P&gt;Joining on a key in tMap, then checking each column with the .equals() method and assigning a value to a flag based on the outcome of the comparison.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Issue:&lt;/P&gt;
&lt;P&gt;I am not able to write the expression that implements this approach.&lt;/P&gt;
&lt;P&gt;Ask:&lt;/P&gt;
&lt;P&gt;I need pointers for writing the expression, or a suggestion if I should take another approach.&lt;/P&gt;</description>
      <pubDate>Wed, 30 Jan 2019 21:16:37 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Comparing-more-than-one-column-from-two-different-files-in-tmap/m-p/2231667#M21891</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2019-01-30T21:16:37Z</dc:date>
    </item>
    <item>
      <title>Re: Comparing more than one column from two different files in tmap</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Comparing-more-than-one-column-from-two-different-files-in-tmap/m-p/2231668#M21892</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;If I understand your use case correctly, you are trying to match the records of two files that have the same schema, and you want to separate the matched and non-matched records between the main flow and the lookup flow.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If that assumption is right, the easiest way to do the matching is to add an inner join between the columns of the main and lookup flows. Matched records (where the match is on all 5 columns) can be sent to an output flow with the flag value "Y". You can separate all non-matched records of the inner join by creating another output flow and setting the inner join reject option to true for that flow. There you can set the flag value to "N".&lt;/P&gt;
&lt;P&gt;&lt;SPAN class="lia-inline-image-display-wrapper lia-image-align-center" image-alt="Flag value set as true for inner join reject" style="width: 872px;"&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="0683p000009M26p.png"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/136239i954E4C7768262C24/image-size/large?v=v2&amp;amp;px=999" role="button" title="0683p000009M26p.png" alt="0683p000009M26p.png" /&gt;&lt;/span&gt;&lt;SPAN class="lia-inline-image-caption" onclick="event.preventDefault();"&gt;Flag value set as true for inner join reject&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;You can store the output in separate files or in tHashOutput for later processing. If you want to merge the outputs, read the data from both output flows in a separate subjob and combine them with the tUnite component.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;Warm Regards,&lt;BR /&gt;Nikhil Thampi&lt;/P&gt;</description>
      <pubDate>Thu, 31 Jan 2019 01:58:31 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Comparing-more-than-one-column-from-two-different-files-in-tmap/m-p/2231668#M21892</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2019-01-31T01:58:31Z</dc:date>
    </item>
    <item>
      <title>Re: Comparing more than one column from two different files in tmap</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Comparing-more-than-one-column-from-two-different-files-in-tmap/m-p/2231669#M21893</link>
      <description>&lt;P&gt;Hi Nikhil,&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;I think it needs slightly more complicated logic because of the inner join lookup (catch rejected):&lt;/P&gt;
&lt;UL&gt;
 &lt;LI&gt;it catches rejected rows only from the main flow (not from the lookup)&lt;/LI&gt;
 &lt;LI&gt;what if the number of rows differs? For example, the lookup file contains more rows than the main file&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;So we first need to clarify the goal and the environment, and then think about how best to approach the goal &lt;span class="lia-inline-image-display-wrapper" image-alt="0683p000009MACn.png"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/154443iC5B8CACEF3D12C6A/image-size/large?v=v2&amp;amp;px=999" role="button" title="0683p000009MACn.png" alt="0683p000009MACn.png" /&gt;&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 31 Jan 2019 02:18:30 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Comparing-more-than-one-column-from-two-different-files-in-tmap/m-p/2231669#M21893</guid>
      <dc:creator>vapukov</dc:creator>
      <dc:date>2019-01-31T02:18:30Z</dc:date>
    </item>
    <item>
      <title>Re: Comparing more than one column from two different files in tmap</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Comparing-more-than-one-column-from-two-different-files-in-tmap/m-p/2231670#M21894</link>
      <description>&lt;P&gt;Thanks Vapukov,&lt;/P&gt; 
&lt;P&gt;The intention is to implement CDC. I have the Open Studio version, which doesn't have CDC components.&lt;/P&gt;
&lt;P&gt;Moreover, CDC needs to be implemented on files; no database is involved.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I am trying to identify unmatched records from both files in a single job.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I could implement the logic to get the unmatched records from one file using the option Nikhil described, but not from both input files.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;By creating two subjobs I could also get the unmatched records from both files, but what should I do to check whether any column other than the key field has been updated?&lt;/P&gt;
&lt;P&gt;I am thinking of defining a variable in tMap and comparing each column with some function (I don't know which function I should use for the comparison).&lt;/P&gt;</description>
      <pubDate>Thu, 31 Jan 2019 05:26:37 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Comparing-more-than-one-column-from-two-different-files-in-tmap/m-p/2231670#M21894</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2019-01-31T05:26:37Z</dc:date>
    </item>
    <item>
      <title>Re: Comparing more than one column from two different files in tmap</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Comparing-more-than-one-column-from-two-different-files-in-tmap/m-p/2231671#M21895</link>
      <description>&lt;P&gt;Let's talk about "CDC".&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;What does the overall setup look like?&lt;/LI&gt;&lt;LI&gt;Why 2 files?&lt;/LI&gt;&lt;LI&gt;What is the real source of the information?&lt;/LI&gt;&lt;LI&gt;How large are the files?&lt;/LI&gt;&lt;LI&gt;Any CDC needs a primary key. Do the files have one?&lt;/LI&gt;&lt;/UL&gt;</description>
      <pubDate>Thu, 31 Jan 2019 05:31:26 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Comparing-more-than-one-column-from-two-different-files-in-tmap/m-p/2231671#M21895</guid>
      <dc:creator>vapukov</dc:creator>
      <dc:date>2019-01-31T05:31:26Z</dc:date>
    </item>
    <item>
      <title>Re: Comparing more than one column from two different files in tmap</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Comparing-more-than-one-column-from-two-different-files-in-tmap/m-p/2231672#M21896</link>
      <description>Yes, I do have a primary key.&lt;BR /&gt;The files are small.&lt;BR /&gt;I created the files with some sample data and am trying to implement CDC.</description>
      <pubDate>Thu, 31 Jan 2019 06:00:41 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Comparing-more-than-one-column-from-two-different-files-in-tmap/m-p/2231672#M21896</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2019-01-31T06:00:41Z</dc:date>
    </item>
    <item>
      <title>Re: Comparing more than one column from two different files in tmap</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Comparing-more-than-one-column-from-two-different-files-in-tmap/m-p/2231673#M21897</link>
      <description>&lt;P&gt;&lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;for what?!!!!!&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;Seriously - without understanding the full picture, it is not possible to design a proper solution!&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Does your program (application, service, etc.) work with plain CSV files?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;example 1 - pure CDC:&lt;/P&gt;
&lt;P&gt;- MySQL, PostgreSQL - proper CDC technologies are already available on the market for free&lt;/P&gt;
&lt;P&gt;- SQL Server, Oracle, etc. - available in the enterprise versions, but can easily be implemented with triggers&lt;/P&gt;
&lt;P&gt;Catch the transactions (filtered by table if needed), send them to Kafka, and let any number of subscribers parse them.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;example 2:&lt;/P&gt; 
&lt;P&gt;Load both files into any database (including a local SQLite); all the CDC between the 2 files is then just 3 SQL queries.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;example 3:&lt;/P&gt; 
&lt;P&gt;Use a JDBC driver for CSV - again 3 simple queries.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;example 4:&lt;/P&gt; 
&lt;P&gt;Pure Talend - 3 flows:&lt;/P&gt;
&lt;P&gt;- find ids from the 1st file which are not in the 2nd - for INSERT - 1st file as main, 2nd as lookup, INNER JOIN with catch rejected&lt;/P&gt;
&lt;P&gt;- find ids in the 2nd file but not in the 1st - for DELETE - 2nd file as main, 1st as lookup, INNER JOIN with catch rejected&lt;/P&gt;
&lt;P&gt;- for UPDATE - INNER JOIN in tMap by id and filter where&lt;/P&gt;
&lt;PRE&gt;(row1.col1 != row2.col1 || row1.col2 != row2.col2 || row1.col3 != row2.col3 || row1.col4 != row2.col4 || row1.col5 != row2.col5)&lt;/PRE&gt;
&lt;P&gt;For String and other object columns, use .equals() instead of !=, because != on objects compares references rather than values.&lt;/P&gt;
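&lt;P&gt;As a rough sketch, the update filter above can be written with null-safe value comparison and tried outside Talend. The Row class and the isChanged name are hypothetical stand-ins for the row1/row2 structs that tMap exposes, and treating all five columns as Strings is an assumption, not something stated in the thread. Inside tMap only the boolean expression would be used, with java.util.Objects written out in full.&lt;/P&gt;

```java
import java.util.Objects;

public class ChangedRowFilter {

    // Hypothetical stand-in for the row structs Talend generates for tMap inputs.
    // Assumption: all five columns are Strings.
    static class Row {
        final String col1, col2, col3, col4, col5;

        Row(String col1, String col2, String col3, String col4, String col5) {
            this.col1 = col1;
            this.col2 = col2;
            this.col3 = col3;
            this.col4 = col4;
            this.col5 = col5;
        }
    }

    // True when at least one column differs between the two joined rows.
    // Objects.equals compares values and treats two nulls as equal,
    // so empty fields do not cause a NullPointerException.
    static boolean isChanged(Row row1, Row row2) {
        return !Objects.equals(row1.col1, row2.col1)
            || !Objects.equals(row1.col2, row2.col2)
            || !Objects.equals(row1.col3, row2.col3)
            || !Objects.equals(row1.col4, row2.col4)
            || !Objects.equals(row1.col5, row2.col5);
    }

    public static void main(String[] args) {
        Row mainRow = new Row("1", "Anna", "Berlin", null, "A");
        // new String(...) forces a distinct object that holds the same value.
        Row sameRow = new Row(new String("1"), "Anna", "Berlin", null, "A");
        Row editedRow = new Row("1", "Anna", "Munich", null, "A");

        System.out.println(isChanged(mainRow, sameRow));   // false
        System.out.println(isChanged(mainRow, editedRow)); // true
    }
}
```

&lt;P&gt;With != the first comparison could report a change even though the values are identical, because the two "1" strings are different objects.&lt;/P&gt;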
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
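&lt;P&gt;Taken together, the three flows of example 4 amount to the logic below. This is only a plain-Java sketch for checking the expected result on sample data, not a Talend job: the class and method names are made up, in-memory arrays stand in for the two files, the id is assumed to be the first field of each row, and the rows have 3 columns instead of 5 for brevity.&lt;/P&gt;

```java
import java.util.Arrays;

public class FileCdcSketch {

    // Returns the row whose first field (the id) equals the given id, or null.
    static String[] findById(String[][] rows, String id) {
        for (String[] row : rows) {
            if (row[0].equals(id)) {
                return row;
            }
        }
        return null;
    }

    // Labels every difference between the two files, one "ACTION id" per line.
    // Direction follows the post: file1 is main for INSERT, file2 is main for DELETE.
    static String diff(String[][] file1, String[][] file2) {
        StringBuilder out = new StringBuilder();
        for (String[] row : file1) {
            String[] match = findById(file2, row[0]);
            if (match == null) {
                // id exists only in file1: inner join reject with file1 as main
                out.append("INSERT ").append(row[0]).append('\n');
            } else if (!Arrays.equals(row, match)) {
                // id exists in both files but at least one column differs
                out.append("UPDATE ").append(row[0]).append('\n');
            }
        }
        for (String[] row : file2) {
            if (findById(file1, row[0]) == null) {
                // id exists only in file2: inner join reject with file2 as main
                out.append("DELETE ").append(row[0]).append('\n');
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String[][] file1 = {
            {"1", "Anna", "Berlin"},
            {"2", "Ben", "Munich"},
            {"4", "Dora", "Bonn"}
        };
        String[][] file2 = {
            {"1", "Anna", "Berlin"},
            {"2", "Ben", "Hamburg"},
            {"3", "Carl", "Cologne"}
        };
        // Prints: UPDATE 2, INSERT 4, DELETE 3 (one per line)
        System.out.print(diff(file1, file2));
    }
}
```

&lt;P&gt;The linear scan in findById is acceptable only for small sample files like the ones described in this thread; for larger input a hash lookup keyed by id would be the usual choice.&lt;/P&gt;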
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 31 Jan 2019 06:21:45 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Comparing-more-than-one-column-from-two-different-files-in-tmap/m-p/2231673#M21897</guid>
      <dc:creator>vapukov</dc:creator>
      <dc:date>2019-01-31T06:21:45Z</dc:date>
    </item>
  </channel>
</rss>

