Hi,
I'm looking for a way to do fuzzy string matching (fuzzy search, trigram/n-gram, Levenshtein, etc.) in QV script.
Any suggestions?
Ralf
I just copied and tested the code, not much work, though.
Please join the group Data Quality
- Ralf
So, you've found the same VBScript function.
I think trigram comparison makes more sense for scoring and finding duplicates. Maybe I'll post a solution later.
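For readers who haven't seen trigram comparison before, here is a minimal sketch of the general idea in Python (not QV script, and not the solution mentioned in this thread). It assumes a Jaccard score over character trigrams with space padding; the function names `trigrams` and `trigram_similarity` are made up for illustration.

```python
def trigrams(text):
    """Return the set of character trigrams of a lowercased, space-padded string."""
    # Assumption: pad with two leading spaces and one trailing space so that
    # the start and end of the string contribute their own trigrams.
    padded = "  " + text.lower().strip() + " "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


def trigram_similarity(a, b):
    """Jaccard similarity of the two trigram sets: 0.0 (disjoint) to 1.0 (identical)."""
    ta, tb = trigrams(a), trigrams(b)
    return len(ta & tb) / len(ta | tb)


if __name__ == "__main__":
    # "Meyer" and "Meier" share 3 of 9 distinct trigrams.
    print(trigram_similarity("Meyer", "Meier"))
    print(trigram_similarity("Main Street 5", "main street 5"))
```

Pairs scoring above some threshold would then be treated as duplicate candidates; the right threshold depends on the data and has to be tuned.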
Hi Ralf,
do you already have a solution with trigram comparison or something else?
I'm trying to compare about 5000 address records.
The Levenshtein solution works but it takes too long.
Thanks for your help!
Regards
Dominik
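As a rough illustration of why the Levenshtein approach gets slow at this size: comparing every record with every other record needs n(n-1)/2 comparisons, and each comparison costs time proportional to the product of the two string lengths. The Python sketch below is a textbook two-row Levenshtein implementation plus the pair count, not the VBScript function discussed above; `levenshtein` and `pair_count` are hypothetical names.

```python
def levenshtein(a, b):
    """Edit distance (insert, delete, substitute) using two rows of the DP table."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def pair_count(n):
    """Number of unordered pairs among n records."""
    return n * (n - 1) // 2


if __name__ == "__main__":
    print(levenshtein("kitten", "sitting"))  # 3
    print(pair_count(5000))                  # 12497500
```

So 5000 records already mean about 12.5 million distance computations when every pair is compared.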
Hi Dominik,
yes, I have one, built specifically for this use case: finding duplicates in address data.
- Ralf
Hey Ralf,
I'm very interested in your trigram comparison solution.
I have a huge dataset with firstName and lastName, and I want to run a similarity check to find duplicates.
Could you share your trigram solution?
Steve
Hi Steve,
unfortunately I can't, because it's a commercial solution.
- Ralf