Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi,
I am not able to get to a very important concept of fuzzy/close match in QV. Not sure if it has the capability of that.
As an example here is what I am looking to do. I have 2 name fields in 2 different files. Employee Name in Employee file and Vendor Name in Vendor file. I need to run a search to see if any of the employees are acting as vendors, this can't be done using exact join as there could be a small variation in the name, usually a letter or two.
Like matching Lynne to Lynn, John Smithe to Jon Smith.
Please advise
Thanks
Sunny
Hi Josh, I am getting the following msg at that link and when I go to qlikviewcookbook.com, I see a lot of enteries but not the one we want
All downloads are now served from the Tools and Recipes sections at:
Go there to access downloads and updated content
This directory does not yet contain any files.
Hello Sunny,
You can find that mapping example in the Recipe section. Direct link to download the example: http://qlikviewcookbook.com/download/mapping-substrings-2/
Anyway, I have the same need as yours but this example doesn't help me. I think it's easier when the fuzzy search uses single words but when the string to be search is a sentence -like a name and a lastname- there isn't a straightforward solution...
But if you find it, please share!
Regards,
Ariel
Hi,
one solution to match similar names could be to calculate the levenshtein distance (Levenshtein Algorithm) and apply some treshold to it:
tabEmployees:
LOAD * FROM [https://community.qlik.com/servlet/JiveServlet/download/1029287-223516/Employee%20file.xlsx] (ooxml, embedded labels, table is Sheet1);
tabVendors:
LOAD * FROM [https://community.qlik.com/servlet/JiveServlet/download/1029287-223515/vendors.xlsx] (ooxml, embedded labels, table is Sheet1);
tabLink:
LOAD Distinct [Emp Name] Resident tabEmployees;
Join
LOAD Distinct [vendor name] Resident tabVendors;
Join
LOAD [Emp Name], [vendor name], levenshtein([Emp Name], [vendor name]) as LevDist Resident tabLink;
Right Join
LOAD Distinct LevDist Resident tabLink Where LevDist<=2;
hope this helps
regards
Marco
Thanks Marco,
How are you getting the function levenshtein in the script? I tried but this function does not exist.
Hi,
levenshtein is a User Defined Function in the Macro Editor. Click ( Ctrl + M ), you can find it..
Thanks settu, can you please attach the qvw, I don't have that
I just showed it from marcowedelattachment. You can download it.. Or You can just the mentioned link.
Thanks I missed that
Thanks Marco
Hi Marco,
I just added a few non matching exceptions also in the file and those came too in the table box. Any way we could avoid that to appear in the table box.
Emp no | Emp Name | vendor name | vendor num |
12 | dell | de | 2 |
999 | genpact | ||
111 | John Smithe | Jon Smith | 88888 |
122 | Lynne | Lynn | 98878 |
233 | Mike Pepper | Mike Peper | 99876 |
gen | 1 | ||
o | 3 | ||
p | 4 | ||
q | 5 | ||
r | 6 | ||
s | 7 |