<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: unique values extraction in Talend Studio</title>
    <link>https://community.qlik.com/t5/Talend-Studio/unique-values-extraction/m-p/2237582#M25972</link>
    <description>&lt;P&gt;In addition to the valid points from Nikhil, another approach I would suggest is:&lt;/P&gt; 
&lt;P&gt;1) Sort your data on the column you want to check (in your example, Name).&lt;/P&gt; 
&lt;P&gt;2) Compare each row with the previous one on that column: if the value matches, ignore the row; otherwise carry it forward. With this approach only the first row for each value of that column is processed. However, you have to make doubly sure that your sort criteria are correct and valid.&lt;/P&gt; 
&lt;P&gt;Eg.&amp;nbsp; &amp;nbsp;&lt;STRONG&gt;Input Data&lt;/STRONG&gt;&lt;/P&gt; 
&lt;P&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;SNo.&amp;nbsp; &amp;nbsp; &amp;nbsp;Name&amp;nbsp; &amp;nbsp; &amp;nbsp; Age&lt;/P&gt; 
&lt;P&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; 100&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Mohan&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;23&lt;/P&gt; 
&lt;P&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; 101&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Mike&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 45&lt;/P&gt; 
&lt;P&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; 102&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Mohan&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;32&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;&lt;STRONG&gt;Sorted Data&lt;/STRONG&gt;&lt;/P&gt; 
&lt;P&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; SNo.&amp;nbsp; &amp;nbsp; &amp;nbsp;Name&amp;nbsp; &amp;nbsp; &amp;nbsp; Age&lt;/P&gt; 
&lt;P&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; 100&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Mohan&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;23&lt;/P&gt; 
&lt;P&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; 102&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Mohan&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;32&lt;/P&gt; 
&lt;P&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; 101&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Mike&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 45&lt;/P&gt; 
&lt;P&gt;&lt;STRONG&gt;Process Detail&lt;/STRONG&gt;&lt;/P&gt; 
&lt;P&gt;When the current row is SNo=100, there is nothing to compare against, so it goes through. When the current row is 102, its Name equals the Name of the previous record, so you can ignore it. The next record has Name=Mike, which is different from Mohan, so it goes through.&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;This approach would also greatly help performance, because if the incoming data volume is huge, an aggregator will quickly become a performance bottleneck. This is an age-old technique where you compare rows on the fly... Hope that helps&lt;/P&gt;</description>
    <pubDate>Fri, 03 Jan 2020 15:27:17 GMT</pubDate>
    <dc:creator>tnewbie</dc:creator>
    <dc:date>2020-01-03T15:27:17Z</dc:date>
    <item>
      <title>unique values extraction</title>
      <link>https://community.qlik.com/t5/Talend-Studio/unique-values-extraction/m-p/2237580#M25970</link>
      <description>&lt;P&gt;Hi All,&lt;/P&gt; 
&lt;P&gt;I need help extracting the unique values of multiple columns, each column individually, from a table using Talend/SQL.&lt;/P&gt; 
&lt;P&gt;EXAMPLE:&lt;/P&gt; 
&lt;P&gt;INPUT:&lt;/P&gt; 
&lt;TABLE width="279"&gt; 
 &lt;TBODY&gt; 
  &lt;TR&gt; 
   &lt;TD width="64"&gt;ID&lt;/TD&gt; 
   &lt;TD width="87"&gt;NAME&lt;/TD&gt; 
   &lt;TD width="64"&gt;SALARY&lt;/TD&gt; 
   &lt;TD width="64"&gt;AGE&lt;/TD&gt; 
  &lt;/TR&gt; 
  &lt;TR&gt; 
   &lt;TD&gt;1&lt;/TD&gt; 
   &lt;TD&gt;MOHAN&lt;/TD&gt; 
   &lt;TD&gt;5000&lt;/TD&gt; 
   &lt;TD&gt;27&lt;/TD&gt; 
  &lt;/TR&gt; 
  &lt;TR&gt; 
   &lt;TD&gt;2&lt;/TD&gt; 
   &lt;TD&gt;AJAY&lt;/TD&gt; 
   &lt;TD&gt;2700&lt;/TD&gt; 
   &lt;TD&gt;26&lt;/TD&gt; 
  &lt;/TR&gt; 
  &lt;TR&gt; 
   &lt;TD&gt;3&lt;/TD&gt; 
   &lt;TD&gt;SAI&lt;/TD&gt; 
   &lt;TD&gt;5000&lt;/TD&gt; 
   &lt;TD&gt;29&lt;/TD&gt; 
  &lt;/TR&gt; 
  &lt;TR&gt; 
   &lt;TD&gt;4&lt;/TD&gt; 
   &lt;TD&gt;RAMA&lt;/TD&gt; 
   &lt;TD&gt;3000&lt;/TD&gt; 
   &lt;TD&gt;27&lt;/TD&gt; 
  &lt;/TR&gt; 
  &lt;TR&gt; 
   &lt;TD&gt;5&lt;/TD&gt; 
   &lt;TD&gt;MOHAN&lt;/TD&gt; 
   &lt;TD&gt;4500&lt;/TD&gt; 
   &lt;TD&gt;29&lt;/TD&gt; 
  &lt;/TR&gt; 
 &lt;/TBODY&gt; 
&lt;/TABLE&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;EXPECTED OUTPUT:&lt;/P&gt; 
&lt;TABLE width="279"&gt; 
 &lt;TBODY&gt; 
  &lt;TR&gt; 
   &lt;TD width="64"&gt;ID&lt;/TD&gt; 
   &lt;TD width="87"&gt;NAME&lt;/TD&gt; 
   &lt;TD width="64"&gt;SALARY&lt;/TD&gt; 
   &lt;TD width="64"&gt;AGE&lt;/TD&gt; 
  &lt;/TR&gt; 
  &lt;TR&gt; 
   &lt;TD&gt;1&lt;/TD&gt; 
   &lt;TD&gt;MOHAN&lt;/TD&gt; 
   &lt;TD&gt;5000&lt;/TD&gt; 
   &lt;TD&gt;27&lt;/TD&gt; 
  &lt;/TR&gt; 
  &lt;TR&gt; 
   &lt;TD&gt;2&lt;/TD&gt; 
   &lt;TD&gt;AJAY&lt;/TD&gt; 
   &lt;TD&gt;2700&lt;/TD&gt; 
   &lt;TD&gt;26&lt;/TD&gt; 
  &lt;/TR&gt; 
  &lt;TR&gt; 
   &lt;TD&gt;3&lt;/TD&gt; 
   &lt;TD&gt;SAI&lt;/TD&gt; 
   &lt;TD&gt;3000&lt;/TD&gt; 
   &lt;TD&gt;29&lt;/TD&gt; 
  &lt;/TR&gt; 
  &lt;TR&gt; 
   &lt;TD&gt;4&lt;/TD&gt; 
   &lt;TD&gt;RAMA&lt;/TD&gt; 
   &lt;TD&gt;4500&lt;/TD&gt; 
   &lt;TD&gt;&amp;nbsp;&lt;/TD&gt; 
  &lt;/TR&gt; 
  &lt;TR&gt; 
   &lt;TD&gt;5&lt;/TD&gt; 
   &lt;TD&gt;&amp;nbsp;&lt;/TD&gt; 
   &lt;TD&gt;&amp;nbsp;&lt;/TD&gt; 
   &lt;TD&gt;&amp;nbsp;&lt;/TD&gt; 
  &lt;/TR&gt; 
 &lt;/TBODY&gt; 
&lt;/TABLE&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;Thanks in advance.&lt;/P&gt; 
&lt;P&gt;Regards,&lt;/P&gt; 
&lt;P&gt;Bhargav&lt;/P&gt;</description>
      <pubDate>Sat, 16 Nov 2024 03:41:09 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/unique-values-extraction/m-p/2237580#M25970</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2024-11-16T03:41:09Z</dc:date>
    </item>
    <item>
      <title>Re: unique values extraction</title>
      <link>https://community.qlik.com/t5/Talend-Studio/unique-values-extraction/m-p/2237581#M25971</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;&amp;nbsp; &amp;nbsp; You can use either tAggregateRow or tUniqRow to select the unique values; pass only the relevant fields as input to these components (you can use a tMap to select the right columns).&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;&amp;nbsp; &amp;nbsp; One thing I would like to point out is that your input and output records do not correlate. For example, you had the two input records shown below.&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;PRE&gt;MOHAN   5000    27
MOHAN   4500    29&lt;/PRE&gt; 
&lt;P&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp;But you have taken the unique value as&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;TABLE width="279"&gt; 
 &lt;TBODY&gt; 
  &lt;TR&gt; 
   &lt;TD&gt;MOHAN&lt;/TD&gt; 
   &lt;TD&gt;5000&lt;/TD&gt; 
   &lt;TD&gt;27&lt;/TD&gt; 
  &lt;/TR&gt; 
 &lt;/TBODY&gt; 
&lt;/TABLE&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp;You might be joining records of two persons with name Mohan with different age to a single output record. Since the output record details are a combination of the earlier two records it may result in a totally new imaginary person.&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp;So please double check how you would like to aggregate the input records.&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
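&lt;P&gt;If the goal really is the distinct values of each column taken independently, as the expected output suggests, the idea can be sketched outside Talend roughly as follows. This is only an illustration in plain Python, not a Talend job: the function name distinct_per_column is made up for this example, and padding the shorter columns with None is an assumption about how the blank cells should be represented.&lt;/P&gt; 

```python
from itertools import zip_longest

# Sample rows copied from the question: (ID, NAME, SALARY, AGE)
rows = [
    (1, "MOHAN", 5000, 27),
    (2, "AJAY", 2700, 26),
    (3, "SAI", 5000, 29),
    (4, "RAMA", 3000, 27),
    (5, "MOHAN", 4500, 29),
]


def distinct_per_column(rows):
    """Deduplicate each column independently, keeping first-seen order."""
    columns = zip(*rows)  # transpose the rows into columns
    # dict.fromkeys drops repeated values while preserving insertion order
    uniques = [list(dict.fromkeys(col)) for col in columns]
    # Pad the shorter columns with None so they can be zipped back into rows
    return list(zip_longest(*uniques, fillvalue=None))


result = distinct_per_column(rows)
for row in result:
    print(row)
```

&lt;P&gt;Note that the rows produced this way no longer describe real records, which is exactly the concern raised above: each column is deduplicated on its own and the values are simply laid side by side.&lt;/P&gt; 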
&lt;P&gt;Warm Regards,&lt;BR /&gt;Nikhil Thampi&lt;/P&gt; 
&lt;P&gt;Please appreciate our Talend community members by giving Kudos for sharing their time for your query. If your query is answered, please mark the topic as resolved&lt;/P&gt;</description>
      <pubDate>Fri, 03 Jan 2020 14:51:13 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/unique-values-extraction/m-p/2237581#M25971</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2020-01-03T14:51:13Z</dc:date>
    </item>
    <item>
      <title>Re: unique values extraction</title>
      <link>https://community.qlik.com/t5/Talend-Studio/unique-values-extraction/m-p/2237582#M25972</link>
      <description>&lt;P&gt;In addition to the valid points from Nikhil, another approach I would suggest is:&lt;/P&gt; 
&lt;P&gt;1) Sort your data on the column you want to check (in your example, Name).&lt;/P&gt; 
&lt;P&gt;2) Compare each row with the previous one on that column: if the value matches, ignore the row; otherwise carry it forward. With this approach only the first row for each value of that column is processed. However, you have to make doubly sure that your sort criteria are correct and valid.&lt;/P&gt; 
&lt;P&gt;Eg.&amp;nbsp; &amp;nbsp;&lt;STRONG&gt;Input Data&lt;/STRONG&gt;&lt;/P&gt; 
&lt;P&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;SNo.&amp;nbsp; &amp;nbsp; &amp;nbsp;Name&amp;nbsp; &amp;nbsp; &amp;nbsp; Age&lt;/P&gt; 
&lt;P&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; 100&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Mohan&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;23&lt;/P&gt; 
&lt;P&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; 101&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Mike&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 45&lt;/P&gt; 
&lt;P&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; 102&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Mohan&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;32&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;&lt;STRONG&gt;Sorted Data&lt;/STRONG&gt;&lt;/P&gt; 
&lt;P&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; SNo.&amp;nbsp; &amp;nbsp; &amp;nbsp;Name&amp;nbsp; &amp;nbsp; &amp;nbsp; Age&lt;/P&gt; 
&lt;P&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; 100&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Mohan&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;23&lt;/P&gt; 
&lt;P&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; 102&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Mohan&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;32&lt;/P&gt; 
&lt;P&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; 101&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Mike&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 45&lt;/P&gt; 
&lt;P&gt;&lt;STRONG&gt;Process Detail&lt;/STRONG&gt;&lt;/P&gt; 
&lt;P&gt;When the current row is SNo=100, there is nothing to compare against, so it goes through. When the current row is 102, its Name equals the Name of the previous record, so you can ignore it. The next record has Name=Mike, which is different from Mohan, so it goes through.&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
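&lt;P&gt;The sort-then-compare steps above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not a Talend component: the helper name first_row_per_key is hypothetical, and an ascending sort is assumed, so the order of the groups may differ from the sorted listing shown above.&lt;/P&gt; 

```python
# Sample rows from the example above: (SNo, Name, Age)
rows = [
    (100, "Mohan", 23),
    (101, "Mike", 45),
    (102, "Mohan", 32),
]


def first_row_per_key(rows, key_index):
    """Sort on the key column, then keep a row only when its key
    differs from the key of the row immediately before it."""
    # sorted() is stable, so rows with equal keys keep their input order
    ordered = sorted(rows, key=lambda r: r[key_index])
    kept = []
    previous_key = object()  # sentinel that never equals a real key
    for row in ordered:
        if row[key_index] != previous_key:
            kept.append(row)  # first row seen for this key value
        previous_key = row[key_index]
    return kept


unique_rows = first_row_per_key(rows, key_index=1)
for row in unique_rows:
    print(row)
```

&lt;P&gt;Only the previous key has to be remembered while streaming through the sorted rows, so the comparison itself needs very little memory; the sort is where the cost goes.&lt;/P&gt; 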
&lt;P&gt;This approach would also greatly help performance, because if the incoming data volume is huge, an aggregator will quickly become a performance bottleneck. This is an age-old technique where you compare rows on the fly... Hope that helps&lt;/P&gt;</description>
      <pubDate>Fri, 03 Jan 2020 15:27:17 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/unique-values-extraction/m-p/2237582#M25972</guid>
      <dc:creator>tnewbie</dc:creator>
      <dc:date>2020-01-03T15:27:17Z</dc:date>
    </item>
  </channel>
</rss>

