Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
To join both tables based on Company name, but the company name have inconsistent prefixes.
COLOR MAP CORP | COLOR MAP CORPORATION |
Table1:
LOAD * INLINE [
ID, Company
1001, COLOR MAP CORP
1002, DENNISE PUBLISHING LTD
1003, "DSN GROUP, INC."
1004, DREAM TYPE LLC
1005, TECHNICAL SOLUTIONS (UK)
1006, INFO PRIVATE LIMITED
];
Table2:
LOAD * INLINE [
Company, City
COLOR MAP CORPORATION, London
DENNISE PUBLISHING , Delhi
"DSN GROUP, INC", Germany
DREAM TYPE , Tokyo
TECHNICAL SOLUTIONS , London
INFO PVT LTD., Paris
];
One solution is Levenshtein Distance.
VBA Macro:
Function levenshtein( a, b )
Dim i,j,cost,d,min1,min2,min3
' Avoid calculations where there there are empty words
If Len( a ) = 0 Then levenshtein = Len( b 😞 Exit Function
If Len( b ) = 0 Then levenshtein = Len( a 😞 Exit Function
' Array initialization
ReDim d( Len( a ), Len( b ) )
For i = 0 To Len( a 😞 d( i, 0 ) = i: Next
For j = 0 To Len( b 😞 d( 0, j ) = j: Next
' Actual calculation
For i = 1 To Len( a )
For j = 1 To Len( b )
If Mid(a, i, 1) = Mid(b, j, 1) Then cost = 0 Else cost = 1 End If
' Since min() function is not a part of VBScript, we'll "emulate" it below
min1 = ( d( i - 1, j ) + 1 )
min2 = ( d( i, j - 1 ) + 1 )
min3 = ( d( i - 1, j - 1 ) + cost )
If min1 <= min2 And min1 <= min3 Then
d( i, j ) = min1
ElseIf min2 <= min1 And min2 <= min3 Then
d( i, j ) = min2
Else
d( i, j ) = min3
End If
Next
Next
levenshtein = d( Len( a ), Len( b ) )
End Function
Script:
Table1:
LOAD * INLINE [
ID, Company
1001, COLOR MAP CORP
1002, DENNISE PUBLISHING LTD
1003, "DSN GROUP, INC."
1004, DREAM TYPE LLC
1005, TECHNICAL SOLUTIONS (UK)
1006, INFO PRIVATE LIMITED
];
Join
Table2:
LOAD * INLINE [
Company2, City
COLOR MAP CORPORATION, London
DENNISE PUBLISHING , Delhi
"DSN GROUP, INC", Germany
DREAM TYPE , Tokyo
TECHNICAL SOLUTIONS , London
INFO PVT LTD., Paris
];
tabOut:
LOAD *, Levenshtein(Company, Company2) As LevenDist
Resident Table1;
Drop Table Table1;
One solution is Levenshtein Distance.
VBA Macro:
Function levenshtein( a, b )
Dim i,j,cost,d,min1,min2,min3
' Avoid calculations where there there are empty words
If Len( a ) = 0 Then levenshtein = Len( b 😞 Exit Function
If Len( b ) = 0 Then levenshtein = Len( a 😞 Exit Function
' Array initialization
ReDim d( Len( a ), Len( b ) )
For i = 0 To Len( a 😞 d( i, 0 ) = i: Next
For j = 0 To Len( b 😞 d( 0, j ) = j: Next
' Actual calculation
For i = 1 To Len( a )
For j = 1 To Len( b )
If Mid(a, i, 1) = Mid(b, j, 1) Then cost = 0 Else cost = 1 End If
' Since min() function is not a part of VBScript, we'll "emulate" it below
min1 = ( d( i - 1, j ) + 1 )
min2 = ( d( i, j - 1 ) + 1 )
min3 = ( d( i - 1, j - 1 ) + cost )
If min1 <= min2 And min1 <= min3 Then
d( i, j ) = min1
ElseIf min2 <= min1 And min2 <= min3 Then
d( i, j ) = min2
Else
d( i, j ) = min3
End If
Next
Next
levenshtein = d( Len( a ), Len( b ) )
End Function
Script:
Table1:
LOAD * INLINE [
ID, Company
1001, COLOR MAP CORP
1002, DENNISE PUBLISHING LTD
1003, "DSN GROUP, INC."
1004, DREAM TYPE LLC
1005, TECHNICAL SOLUTIONS (UK)
1006, INFO PRIVATE LIMITED
];
Join
Table2:
LOAD * INLINE [
Company2, City
COLOR MAP CORPORATION, London
DENNISE PUBLISHING , Delhi
"DSN GROUP, INC", Germany
DREAM TYPE , Tokyo
TECHNICAL SOLUTIONS , London
INFO PVT LTD., Paris
];
tabOut:
LOAD *, Levenshtein(Company, Company2) As LevenDist
Resident Table1;
Drop Table Table1;