<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: TFuzzyMatch using Levenshtein Method in Talend Data Catalog</title>
    <link>https://community.qlik.com/t5/Talend-Data-Catalog/TFuzzyMatch-using-Levenshtein-Method/m-p/2330679#M919</link>
    <description>Hi, 
&lt;BR /&gt;We are trying to identify duplicate customers based on First Name, Last Name, Phone Number, Email, Address, and Zip Code. I have applied an exact match on Phone Number and Zip Code, and the Levenshtein method on the other fields.</description>
    <pubDate>Wed, 19 Mar 2014 09:16:09 GMT</pubDate>
    <dc:creator>Anonymous</dc:creator>
    <dc:date>2014-03-19T09:16:09Z</dc:date>
    <item>
      <title>TFuzzyMatch using Levenshtein Method</title>
      <link>https://community.qlik.com/t5/Talend-Data-Catalog/TFuzzyMatch-using-Levenshtein-Method/m-p/2330676#M916</link>
      <description>Hi,&lt;BR /&gt;I would like to understand the matching logic when multiple key attributes are matched using the Levenshtein method, with the min and max distance set to 0 and 5 respectively. Specifically: are records classified as duplicates when they meet any single criterion, or only when they meet all of the criteria?</description>
      <pubDate>Tue, 18 Mar 2014 14:37:08 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Data-Catalog/TFuzzyMatch-using-Levenshtein-Method/m-p/2330676#M916</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2014-03-18T14:37:08Z</dc:date>
    </item>
    <item>
      <title>Re: TFuzzyMatch using Levenshtein Method</title>
      <link>https://community.qlik.com/t5/Talend-Data-Catalog/TFuzzyMatch-using-Levenshtein-Method/m-p/2330677#M917</link>
      <description>Sorry, I replied to the wrong thread.</description>
      <pubDate>Tue, 18 Mar 2014 15:30:01 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Data-Catalog/TFuzzyMatch-using-Levenshtein-Method/m-p/2330677#M917</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2014-03-18T15:30:01Z</dc:date>
    </item>
    <item>
      <title>Re: TFuzzyMatch using Levenshtein Method</title>
      <link>https://community.qlik.com/t5/Talend-Data-Catalog/TFuzzyMatch-using-Levenshtein-Method/m-p/2330678#M918</link>
      <description>Mr.M, 
&lt;BR /&gt;If you build a compound key of multiple columns, then all of them are taken into account together for the match, not individually. 
&lt;BR /&gt;To better answer your question, I would also like to understand more about your data, use case, and ultimate goal. There are several matching components. Which one are you using? A screenshot of the job with the component settings would be very useful.</description>
      <pubDate>Tue, 18 Mar 2014 18:06:47 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Data-Catalog/TFuzzyMatch-using-Levenshtein-Method/m-p/2330678#M918</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2014-03-18T18:06:47Z</dc:date>
    </item>
    <item>
      <title>Re: TFuzzyMatch using Levenshtein Method</title>
      <link>https://community.qlik.com/t5/Talend-Data-Catalog/TFuzzyMatch-using-Levenshtein-Method/m-p/2330679#M919</link>
      <description>Hi, 
&lt;BR /&gt;We are trying to identify duplicate customers based on First Name, Last Name, Phone Number, Email, Address, and Zip Code. I have applied an exact match on Phone Number and Zip Code, and the Levenshtein method on the other fields.</description>
      <pubDate>Wed, 19 Mar 2014 09:16:09 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Data-Catalog/TFuzzyMatch-using-Levenshtein-Method/m-p/2330679#M919</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2014-03-19T09:16:09Z</dc:date>
    </item>
    <item>
      <title>Re: TFuzzyMatch using Levenshtein Method</title>
      <link>https://community.qlik.com/t5/Talend-Data-Catalog/TFuzzyMatch-using-Levenshtein-Method/m-p/2330680#M920</link>
      <description>Also, I would like to understand how the tFuzzyMatch logic treats missing values.</description>
      <pubDate>Wed, 19 Mar 2014 10:12:40 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Data-Catalog/TFuzzyMatch-using-Levenshtein-Method/m-p/2330680#M920</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2014-03-19T10:12:40Z</dc:date>
    </item>
    <item>
      <title>Re: TFuzzyMatch using Levenshtein Method</title>
      <link>https://community.qlik.com/t5/Talend-Data-Catalog/TFuzzyMatch-using-Levenshtein-Method/m-p/2330681#M921</link>
      <description>Hi, 
&lt;BR /&gt;As a follow-up, I would also like to understand whether Talend tFuzzyMatch supports the following. 
&lt;BR /&gt;Let us say I want to perform a match on Name, Address, Email, and Phone Number: 
&lt;BR /&gt;1. What if some fields are empty for some records, that is, the fill rate is less than 100%? How does Talend handle matching in that scenario? 
&lt;BR /&gt;2. Can we specify multiple rules in one go, such as (Name, Address, Email, Phone Number) or (Name, Email, Phone Number) or (Name, Email) or (Name, Phone Number)? That is, if any of these 4 rules is satisfied, Talend should return the records as duplicates.</description>
      <pubDate>Thu, 20 Mar 2014 13:42:21 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Data-Catalog/TFuzzyMatch-using-Levenshtein-Method/m-p/2330681#M921</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2014-03-20T13:42:21Z</dc:date>
    </item>
    <item>
      <title>Re: TFuzzyMatch using Levenshtein Method</title>
      <link>https://community.qlik.com/t5/Talend-Data-Catalog/TFuzzyMatch-using-Levenshtein-Method/m-p/2330682#M922</link>
      <description>Hi, 
&lt;BR /&gt;I am using Talend Open Studio version 6.1. Is it possible to perform in-line matching using the tFuzzyMatch component? I want to match on more than one column, such as firstname, lastname, address, zip, and phone number. Also, is it possible to get separate outputs for duplicate and unique values using this component?</description>
      <pubDate>Mon, 28 Mar 2016 11:24:16 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Data-Catalog/TFuzzyMatch-using-Levenshtein-Method/m-p/2330682#M922</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2016-03-28T11:24:16Z</dc:date>
    </item>
    <item>
      <title>Re: TFuzzyMatch using Levenshtein Method</title>
      <link>https://community.qlik.com/t5/Talend-Data-Catalog/TFuzzyMatch-using-Levenshtein-Method/m-p/2330683#M923</link>
      <description>Hi, 
&lt;BR /&gt; 
&lt;BLOCKQUOTE&gt;I am using Talend Open Studio version 6.1. Is it possible to perform in-line matching using the tFuzzyMatch component? I want to match on more than one column, such as firstname, lastname, address, zip, and phone number. Also, is it possible to get separate outputs for duplicate and unique values using this component?&lt;/BLOCKQUOTE&gt; 
&lt;BR /&gt;For your in-line operation, could you please elaborate on your case with an example showing the input and the expected output values? 
&lt;BR /&gt;Here is a component, &lt;A href="https://help.talend.com/search/all?query=tRecordMatching&amp;amp;content-lang=en" target="_blank" rel="nofollow noopener noreferrer"&gt;TalendHelpCenter:tRecordMatching&lt;/A&gt;, which joins two tables by doing a fuzzy match on several columns using a wide variety of comparison algorithms (you can define several keys). 
&lt;BR /&gt;Note: This component is available in the Palette of Talend Studio only if you have subscribed to one of the Talend Platform products. 
&lt;BR /&gt;Best regards 
&lt;BR /&gt;Sabrina</description>
      <pubDate>Tue, 29 Mar 2016 09:30:50 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Data-Catalog/TFuzzyMatch-using-Levenshtein-Method/m-p/2330683#M923</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2016-03-29T09:30:50Z</dc:date>
    </item>
  </channel>
</rss>

