martynlloyd
Partner - Creator III

Use HASH to match misspelt names?

Hi,

I have two sets of customer name data: one is reliable, the other is not. For example:

Correct name: ACE Construction and Demolition Limited

Variations in user input

ACE Construction

ACE Demolition

A.C.E. Contrction LTD

I want to create a 'best-fit' matching application. I'm thinking of resequencing the letters of each name into alphabetical order, so the correct name becomes

aacccdddeeeiiiiillmmnnnnoooorsttttu

ACE Construction would then have a match coefficient of 15 out of 35 (43%); removing Limited, LTD, etc. would give a match of 15/28, or 54%.
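
The idea above could be sketched roughly as follows in Python. This is only an illustration under assumptions: the function names (`letters`, `signature`, `match_coefficient`) and the `STOP_WORDS` list are made up for the example, and the coefficient is taken to be the number of letters shared with the correct name divided by the letter count of the correct name.

```python
from collections import Counter

# Hypothetical list of legal-form words to ignore; extend as needed.
STOP_WORDS = frozenset({"limited", "ltd"})


def letters(name, stop_words=frozenset()):
    """Lower-case the name, drop punctuation and stop words, keep only letters."""
    cleaned = "".join(ch if ch.isalpha() else " " for ch in name.lower())
    return "".join(w for w in cleaned.split() if w not in stop_words)


def signature(name, stop_words=frozenset()):
    """Letters of the name resequenced into alphabetical order."""
    return "".join(sorted(letters(name, stop_words)))


def match_coefficient(candidate, reference, stop_words=frozenset()):
    """Shared letter count divided by the reference letter count (0.0 to 1.0)."""
    cand = Counter(letters(candidate, stop_words))
    ref = Counter(letters(reference, stop_words))
    if not ref:
        return 0.0
    shared = sum((cand & ref).values())  # multiset intersection
    return shared / sum(ref.values())


correct = "ACE Construction and Demolition Limited"
print(signature(correct))
print(match_coefficient("ACE Construction", correct))              # 15 of 35 letters
print(match_coefficient("ACE Construction", correct, STOP_WORDS))  # 15 of 28 letters
```

One caveat with this sketch: because it only compares letter counts, two different names made of the same letters would score as a perfect match.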

Any ideas?

Best regards,

Marty.

2 Replies
shane_spencer
Specialist

This document sprang to mind: http://community.qlik.com/docs/DOC-7051. It's not exactly the same, but it seems to do a similar thing.

rbecher
MVP

Hi Martyn,

You can try the Levenshtein distance algorithm:

http://community.qlik.com/message/517405#517405
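
For readers who cannot follow the link, here is a minimal Python sketch of the standard Levenshtein distance (the number of single-character insertions, deletions, and substitutions needed to turn one string into another). It is not taken from the linked post; the function name `levenshtein` and the lower-casing of the inputs are assumptions made for the example.

```python
def levenshtein(a, b):
    """Edit distance between two strings, using two rows of the DP table."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))  # distance from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


# Compare a user-entered word with the correct one, ignoring case.
print(levenshtein("Construction".lower(), "Contrction".lower()))
```

A smaller distance means a closer match, so the best-fit customer would be the reliable name with the lowest distance to the user input.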

- Ralf

Astrato.io Head of R&D