Anonymous
Not applicable

tFuzzyMatch using Levenshtein Method

Hi,
I would like to understand the matching logic when multiple key attributes are matched using the Levenshtein method, with the min and max distance set to 0 and 5 respectively. What I want to know is: are records categorized as duplicates when they meet even a single criterion, or only when they meet all of the criteria?
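For readers unfamiliar with the terms in this question, here is a minimal sketch of what a Levenshtein distance and a min/max distance window mean for a single column. It is a plain Python illustration, not Talend's implementation; the function names `levenshtein` and `within_distance` are hypothetical, and the defaults 0 and 5 simply mirror the settings mentioned above.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance: fewest single-character inserts, deletes, or substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def within_distance(a: str, b: str, min_distance: int = 0, max_distance: int = 5) -> bool:
    """True when the edit distance falls inside the configured window (inclusive)."""
    return min_distance <= levenshtein(a, b) <= max_distance


print(levenshtein("kitten", "sitting"))
print(within_distance("Jon", "John"))
```

Note that a window of 0 to 5 is very permissive for short values: any two strings of up to five characters are within distance 5 of each other.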
7 Replies
Anonymous
Not applicable
Author

Sorry, I replied to the wrong thread.
Anonymous
Not applicable
Author

Mr.M,
If you build a compound key from multiple columns, then all of them are taken into account for the match together, not each one individually.
To better answer your question, I would also like to understand more about your data, your use case, and your ultimate goal. There are several matching components. Which one are you using? A screencap of the job with the component settings would be very useful.
Anonymous
Not applicable
Author

Hi,
We are trying to identify duplicate customers based on First Name, Last Name, Phone Number, Email, Address, and Zip Code. I have applied an exact match on Phone Number and Zip Code, and the Levenshtein method on the other columns.
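As a rough illustration of this setup under the "all key columns must pass" reading given in the earlier reply, the sketch below combines exact and Levenshtein comparisons. It is not how any Talend component is implemented; the column names, the `is_duplicate` helper, the lowercasing, and the per-column threshold of 5 are all assumptions made for the example.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance computed with the standard two-row dynamic programme."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            current.append(min(previous[j] + 1,
                               current[j - 1] + 1,
                               previous[j - 1] + (ch_a != ch_b)))
        previous = current
    return previous[-1]


# Hypothetical key configuration mirroring the post: exact vs. fuzzy columns.
EXACT_KEYS = ("phone", "zip")
FUZZY_KEYS = ("first_name", "last_name", "email", "address")
MAX_DISTANCE = 5  # assumed per-column threshold


def is_duplicate(left: dict, right: dict) -> bool:
    """Return True only when every key column passes its own comparison."""
    exact_ok = all(left[k] == right[k] for k in EXACT_KEYS)
    fuzzy_ok = all(
        levenshtein(left[k].lower(), right[k].lower()) <= MAX_DISTANCE
        for k in FUZZY_KEYS
    )
    return exact_ok and fuzzy_ok


a = {"first_name": "Jon", "last_name": "Smith", "email": "jon.smith@example.com",
     "address": "12 Main Street", "phone": "5551234", "zip": "10001"}
b = {"first_name": "John", "last_name": "Smyth", "email": "jon.smith@example.com",
     "address": "12 Main St", "phone": "5551234", "zip": "10001"}
print(is_duplicate(a, b))
```

Under this reading, a single differing exact-match column (for example a different zip code) is enough to reject the pair, however close the fuzzy columns are.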
Anonymous
Not applicable
Author

Also, I want to understand how the tFuzzyMatch logic treats missing values.
Anonymous
Not applicable
Author

Hi,
Following up, I also want to understand whether Talend fuzzy matching supports the features below.
Let us say I want to perform a match on Name, Address, Email, and Phone Number:
1. What if the fields are empty for some records, that is, the fill rate is less than 100%? How does Talend handle matching in that scenario?
2. Can we specify multiple rules in one go, such as (Name, Address, Email, Phone Number) or (Name, Email, Phone Number) or (Name, Email) or (Name, Phone Number)? In other words, if any of these 4 rules is satisfied, Talend should return the records as duplicates.
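To make the two points above concrete, here is a small Python sketch of the requested behaviour: several rules combined with OR, where each rule requires all of its fields to match. It is only an illustration and says nothing about what Talend actually does. The names `RULES`, `field_matches`, and `matching_rule`, the threshold of 2, and in particular the policy that a missing value never matches are assumptions chosen for the example.

```python
from typing import Optional


def levenshtein(a: str, b: str) -> int:
    """Edit distance computed with the standard two-row dynamic programme."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            current.append(min(previous[j] + 1,
                               current[j - 1] + 1,
                               previous[j - 1] + (ch_a != ch_b)))
        previous = current
    return previous[-1]


# The four rules from the question; a pair is a duplicate if ANY rule holds.
RULES = [
    ("name", "address", "email", "phone"),
    ("name", "email", "phone"),
    ("name", "email"),
    ("name", "phone"),
]
MAX_DISTANCE = 2  # assumed threshold, applied to every field


def field_matches(a: Optional[str], b: Optional[str]) -> bool:
    """Assumed policy: a missing or empty value never counts as a match."""
    if not a or not b:
        return False
    return levenshtein(a.strip().lower(), b.strip().lower()) <= MAX_DISTANCE


def matching_rule(left: dict, right: dict) -> Optional[tuple]:
    """Return the first rule whose fields ALL match, or None if no rule holds."""
    for rule in RULES:
        if all(field_matches(left.get(f), right.get(f)) for f in rule):
            return rule
    return None


left = {"name": "Maria Lopez", "email": "maria@example.com", "phone": None, "address": ""}
right = {"name": "Maria Lopes", "email": "maria@example.com",
         "phone": "5550100", "address": "1 Elm Rd"}
print(matching_rule(left, right))
```

Under this reading, the first two rules never change the outcome: any pair that satisfies them also satisfies (Name, Email), so the rule set is equivalent to (Name, Email) or (Name, Phone Number). The longer rules only matter if they are scored or reported differently.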
Anonymous
Not applicable
Author

Hi,
I am using Talend Open Studio version 6.1. Is it possible to perform in-line matching using the tFuzzyMatch component? I want to match on more than one column, such as firstname, lastname, address, zip, and phone number. Also, is it possible to get separate outputs for duplicate and unique values using this component?
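To clarify what is being asked, the sketch below shows one possible meaning of in-line matching with separate outputs: each record is compared against the records already kept from the same input, then routed to either a "unique" or a "duplicate" list. This is a plain Python illustration and not a description of tFuzzyMatch. The helper names, the threshold of 2, and the choice to compare only against previously kept records are assumptions.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance computed with the standard two-row dynamic programme."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            current.append(min(previous[j] + 1,
                               current[j - 1] + 1,
                               previous[j - 1] + (ch_a != ch_b)))
        previous = current
    return previous[-1]


KEYS = ("firstname", "lastname", "address", "zip", "phone")  # columns from the post
MAX_DISTANCE = 2  # assumed threshold for every column


def same_entity(a: dict, b: dict) -> bool:
    """All key columns must be within the threshold for the pair to match."""
    return all(levenshtein(a[k].lower(), b[k].lower()) <= MAX_DISTANCE for k in KEYS)


def split_records(records: list) -> tuple:
    """Route each record to 'uniques' or 'duplicates' in a single pass."""
    uniques, duplicates = [], []
    for record in records:
        if any(same_entity(record, kept) for kept in uniques):
            duplicates.append(record)
        else:
            uniques.append(record)
    return uniques, duplicates


rows = [
    {"firstname": "Alice", "lastname": "Nguyen", "address": "4 Oak Ave",
     "zip": "10001", "phone": "5550101"},
    {"firstname": "Alise", "lastname": "Nguyen", "address": "4 Oak Ave.",
     "zip": "10001", "phone": "5550101"},
    {"firstname": "Robert", "lastname": "Keller", "address": "9 Pine Rd",
     "zip": "90210", "phone": "5550199"},
]
uniques, duplicates = split_records(rows)
print(len(uniques), len(duplicates))
```

This pairwise pass compares every record with every kept record, so it is only practical for small inputs; it is meant to show the routing idea, not a scalable design.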
Anonymous
Not applicable
Author

Hi,

For your in-line operation, could you please elaborate on your case with an example showing the input and the expected output values?

There is a component, tRecordMatching (documented in the Talend Help Center), which joins two tables by doing a fuzzy match on several columns using a wide variety of comparison algorithms. It lets you define several keys.
Note: This component is available in the Palette of Talend Studio only if you have subscribed to one of the Talend Platform products.
Best regards
Sabrina