Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
I want to join two tables with jumbled words of name
Table A
Name | ID |
DES GUZMANSHIP SALASA, Danis Marc | 1 |
SHAH GOPAL, saumya | 2 |
JACOB KING, Stan | 3 |
Table B
Name | Country |
Danis Marc Guzmanship Des Salasa | France |
Saumya Gopal Shah | India |
Stan Jacob King | UK |
There's no built-in way to do this. You'll have to normalize the data so your keys are identical (or derive another field where the data is identical).
Have a look at e.g. https://community.qlik.com/t5/QlikView-App-Dev/Approximate-String-Matching-in-QlikView/td-p/520172 for techniques that might help you compare two similar but not identical strings.
You can try something with LevenshteinDistance https://help.qlik.com/en-US/sense/February2021/Subsystems/Hub/Content/Sense_Hub/Scripting/StringFunc...
First you make a Cartesian Join and then do the math. Be careful with big/huge data, rows...
Tab1:
LOAD 1 as TMP
, *
;
LOAD * INLINE [Name| ID
DES GUZMANSHIP SALASA, Danis Marc |1
SHAH GOPAL, saumya |2
JACOB KING, Stan |3
] (delimiter is '|');
join
LOAD 1 as TMP
, *
;
LOAD * INLINE [Name2|Country
Danis Marc Guzmanship Des Salasa |France
Saumya Gopal Shah |India
Stan Jacob King |UK
] (delimiter is '|');
Tab2:
LOAD *
, LevenshteinDist(lower(Name), lower(Name2)) as Score
Resident Tab1;
inner join (Tab2)
LOAD Name
, min(Score) as Score
Resident Tab2
Group By Name;
drop table Tab1;
drop field Score;