Skip to main content
Announcements
NEW: Seamless Public Data Sharing with Qlik's New Anonymous Access Capability: TELL ME MORE!
cancel
Showing results for 
Search instead for 
Did you mean: 
Anonymous
Not applicable

Fuzzy/Close Name Match

Hi,

I am not able to get to a very important concept of fuzzy/close match in QV. Not sure if it has the capability of that.

As an example here is what I am looking to do. I have 2 name fields in 2 different files. Employee Name in Employee file and Vendor Name in Vendor file. I need to run a search to see if any of the employees are acting as vendors, this can't be done using exact join as there could be a small variation in the name, usually a letter or two.

Like matching Lynne to Lynn, John Smithe to Jon Smith.

Please advise

Thanks

Sunny

1 Solution

Accepted Solutions
MarcoWedel

Hi,

one solution to match similar names could be to calculate the levenshtein distance (Levenshtein Algorithm) and apply some treshold to it:

QlikCommunity_Thread_214856_Pic1.JPG

tabEmployees:

LOAD * FROM [https://community.qlik.com/servlet/JiveServlet/download/1029287-223516/Employee%20file.xlsx] (ooxml, embedded labels, table is Sheet1);

tabVendors:

LOAD * FROM [https://community.qlik.com/servlet/JiveServlet/download/1029287-223515/vendors.xlsx] (ooxml, embedded labels, table is Sheet1);

tabLink:

LOAD Distinct [Emp Name] Resident tabEmployees;

Join

LOAD Distinct [vendor name] Resident tabVendors;

Join

LOAD [Emp Name], [vendor name], levenshtein([Emp Name], [vendor name]) as LevDist Resident tabLink;

Right Join

LOAD Distinct LevDist Resident tabLink Where LevDist<=2;

hope this helps

regards

Marco

View solution in original post

23 Replies
ajsjoshua
Specialist
Specialist

Hi,

pls post a sample in excel.

Not applicable
Author

Please go through below link,

Re: Fuzzy Search in Scripts

marcus_sommer

Useful might be here Please help me get the string pattern of each field. the answers from Marco.

- Marcus

Anonymous
Not applicable
Author

attached

Anonymous
Not applicable
Author

attached sample files for reference

ajsjoshua
Specialist
Specialist

hi,

check the attachment.

Steps:

Try following below steps:

1) LOAD the tables.

2) Create a Straight table, with empnamae and vendorname as dimensions.

3) use below expression

=WildMatch([Emp Name],mid([vendor name],1,1)&'*'& mid([vendor name],2,1)&'*' & mid([vendor name],3,1)&'*')

Regards,

Joshua.

jonathandienst
Partner - Champion III
Partner - Champion III

This may help.

Rob Wunderlich's cookbook has an example doing a fuzzy mapping.

http://robwunderlich.com/Download.html

Logic will get you from a to b. Imagination will take you everywhere. - A Einstein
Anonymous
Not applicable
Author

Thanks Joshua, it works however that example works well for a sample data of a few records. How can I do that if I have thousands of rows to match with each other.

ajsjoshua
Specialist
Specialist

Hi,

Hope it will work fine for 1000 records also,

if possible you share me 1000 records sample,i will try and let you know.

Regards,

Joshua.