Algorithm for matching duplicate names

sanjeev1 — Sat, 16 Nov 2024 02:46:32 GMT

I need to identify duplicate names in a dataset where the same person may appear under both an official name and a friendly name (nickname). For example:

1) William Stark (Official Name)

Bill Stark (Friendly Name)

2) Bradley Thomas (Official Name)

Brad Thomas (Friendly Name)

3) Robert Gordon (Official Name)

Bob Gordon (Friendly Name)

I have been looking at the Jaro, Jaro-Winkler, and Soundex algorithms, but I'm wondering whether there are better methods. I would appreciate any guidance or best practices you can provide.
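To make the question concrete, here is a rough sketch of Jaro and Jaro-Winkler similarity in plain Python, applied to the first names from the examples above. The function names `jaro` and `jaro_winkler` and the prefix weight `p=0.1` are my own illustrative choices, not taken from any particular library, and this is only a minimal version of the textbook definitions.

```python
def jaro(s1: str, s2: str) -> float:
    """Jaro similarity in [0, 1]; 1.0 means identical strings."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    # Characters count as matching only within this distance of each other.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    t = out_of_order / 2
    return (matches / len1 + matches / len2 + (matches - t) / matches) / 3


def jaro_winkler(s1: str, s2: str, p: float = 0.1) -> float:
    """Jaro similarity boosted for a shared prefix of up to 4 characters."""
    base = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1[:4], s2[:4]):
        if a != b:
            break
        prefix += 1
    return base + prefix * p * (1 - base)


if __name__ == "__main__":
    pairs = [("william", "bill"), ("bradley", "brad"), ("robert", "bob")]
    for official, friendly in pairs:
        print(official, friendly, round(jaro_winkler(official, friendly), 3))
```

With this sketch, a pair like Bradley/Brad scores high because the friendly name is a prefix of the official one, while Robert/Bob and William/Bill score noticeably lower because the first letters differ. That gap is the reason I suspect pure string similarity may not be enough for this case.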
