Use HASH to match misspelt names? - Qlik Community

martynlloyd · ‎2015-03-19

Hi,

I have two sets of customer name data: one is reliable, the other is not. For example:

Correct name: ACE Construction and Demolition Limited

Variations in user input:

ACE Construction

ACE Demolition

A.C.E. Contrction LTD

I want to create a 'best-fit' matching application. I'm thinking of resequencing each string's letters into alphabetical order, so that the correct name becomes:

aacccdddeeeiiiiillmmnnnnoooorsttttu

ACE Construction would then have a match coefficient of 15 out of 35; removing Limited, LTD, etc. would give a match of 15/28, or 54%.
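A minimal sketch of this letter-counting idea, in Python rather than Qlik script. The function names and the `STOP_WORDS` list are assumptions made for illustration. It counts how many letters of the candidate also occur in the correct name (as a multiset) and reports that against the number of letters in the correct name.

```python
from collections import Counter

# Assumed list of legal-form words to ignore; extend as needed.
STOP_WORDS = frozenset({"limited", "ltd"})


def letters(name, stop_words=STOP_WORDS):
    """Lowercase the name, strip punctuation, drop stop words, return its letters."""
    words = (
        "".join(ch for ch in token if ch.isalpha())
        for token in name.lower().split()
    )
    return "".join(word for word in words if word and word not in stop_words)


def resequence(name, stop_words=STOP_WORDS):
    """Return the name's letters sorted alphabetically."""
    return "".join(sorted(letters(name, stop_words)))


def match_counts(candidate, correct, stop_words=STOP_WORDS):
    """Return (shared letters, letters in the correct name)."""
    cand = Counter(letters(candidate, stop_words))
    ref = Counter(letters(correct, stop_words))
    shared = sum((cand & ref).values())  # multiset intersection
    return shared, sum(ref.values())


correct = "ACE Construction and Demolition Limited"
shared, total = match_counts("ACE Construction", correct)
print(f"{shared}/{total} = {shared / total:.0%}")
```

Because letter order is ignored, any anagram of the correct name would score 100%, so a score like this is probably best treated as a first-pass filter rather than a final match.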

Any ideas?

Best regards,

Marty.

shane_spencer · ‎2015-03-19

This document sprang to mind: http://community.qlik.com/docs/DOC-7051 (it's not exactly the same, but it seems to do something similar).

rbecher · ‎2015-03-19

Hi Martyn,

you can try the Levenshtein distance algorithm:
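For illustration, a minimal Python sketch of Levenshtein distance (the number of single-character insertions, deletions, or substitutions needed to turn one string into another). This is not Qlik script and not the material originally referenced here. The `similarity` helper, which scales the distance to a 0..1 score, is an assumption added for the example.

```python
def levenshtein(a, b):
    """Edit distance between a and b, computed row by row."""
    if len(a) < len(b):
        a, b = b, a  # keep the rows as short as possible
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def similarity(a, b):
    """Hypothetical helper: 1.0 for identical strings, lower as edits increase."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1 - levenshtein(a, b) / longest


print(levenshtein("contrction", "construction"))
print(similarity("ace contrction", "ace construction"))
```

Names would normally be lowercased and stripped of punctuation and words such as Limited or LTD before comparing, otherwise those differences count as edits too.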

- Ralf

Astrato.io Head of R&D