Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi,
I am not able to get to a very important concept of fuzzy/close match in QV. Not sure if it has the capability of that.
As an example here is what I am looking to do. I have 2 name fields in 2 different files. Employee Name in Employee file and Vendor Name in Vendor file. I need to run a search to see if any of the employees are acting as vendors, this can't be done using exact join as there could be a small variation in the name, usually a letter or two.
Like matching Lynne to Lynn, John Smithe to Jon Smith.
Please advise
Thanks
Sunny
Hi,
one solution to match similar names could be to calculate the levenshtein distance (Levenshtein Algorithm) and apply some treshold to it:
tabEmployees:
LOAD * FROM [https://community.qlik.com/servlet/JiveServlet/download/1029287-223516/Employee%20file.xlsx] (ooxml, embedded labels, table is Sheet1);
tabVendors:
LOAD * FROM [https://community.qlik.com/servlet/JiveServlet/download/1029287-223515/vendors.xlsx] (ooxml, embedded labels, table is Sheet1);
tabLink:
LOAD Distinct [Emp Name] Resident tabEmployees;
Join
LOAD Distinct [vendor name] Resident tabVendors;
Join
LOAD [Emp Name], [vendor name], levenshtein([Emp Name], [vendor name]) as LevDist Resident tabLink;
Right Join
LOAD Distinct LevDist Resident tabLink Where LevDist<=2;
hope this helps
regards
Marco
Hi,
pls post a sample in excel.
Please go through below link,
Useful might be here Please help me get the string pattern of each field. the answers from Marco.
- Marcus
attached
attached sample files for reference
hi,
check the attachment.
Steps:
Try following below steps:
1) LOAD the tables.
2) Create a Straight table, with empnamae and vendorname as dimensions.
3) use below expression
=WildMatch([Emp Name],mid([vendor name],1,1)&'*'& mid([vendor name],2,1)&'*' & mid([vendor name],3,1)&'*')
Regards,
Joshua.
This may help.
Rob Wunderlich's cookbook has an example doing a fuzzy mapping.
http://robwunderlich.com/Download.html
Thanks Joshua, it works however that example works well for a sample data of a few records. How can I do that if I have thousands of rows to match with each other.
Hi,
Hope it will work fine for 1000 records also,
if possible you share me 1000 records sample,i will try and let you know.
Regards,
Joshua.