Hi,
I'm looking for a way to do fuzzy string matching (fuzzy search, trigram/n-gram, Levenshtein, etc.) in QV script.
Any suggestions?
Ralf
Hi Karen,
I found a workable VBScript implementation as a function. It can be called during LOAD at record level, so both strings have to be in the same record. That means you would need to join the source data first:
LOAD Script:
Levenshtein:
LOAD F1, F2, levenshtein(F1,F2) as distance;
LOAD * INLINE [
F1, F2
Qlik, Qlik ltd
Qlik ltd, Qlik Limited
Qlik Limited, QlikTech
Qlik, Klik
];
Module:
' Source:
' http://en.wikibooks.org/wiki/Algorithm_Implementation/Strings/Levenshtein_distance#VBScript
Function levenshtein( a, b )
  Dim i, j, cost, d, min1, min2, min3
  ' Avoid calculations where there are empty words
  If Len( a ) = 0 Then levenshtein = Len( b ): Exit Function
  If Len( b ) = 0 Then levenshtein = Len( a ): Exit Function
  ' Array initialization
  ReDim d( Len( a ), Len( b ) )
  For i = 0 To Len( a ): d( i, 0 ) = i: Next
  For j = 0 To Len( b ): d( 0, j ) = j: Next
  ' Actual calculation
  For i = 1 To Len( a )
    For j = 1 To Len( b )
      If Mid( a, i, 1 ) = Mid( b, j, 1 ) Then
        cost = 0
      Else
        cost = 1
      End If
      ' Since min() function is not a part of VBScript, we'll "emulate" it below
      min1 = ( d( i - 1, j ) + 1 )
      min2 = ( d( i, j - 1 ) + 1 )
      min3 = ( d( i - 1, j - 1 ) + cost )
      If min1 <= min2 And min1 <= min3 Then
        d( i, j ) = min1
      ElseIf min2 <= min1 And min2 <= min3 Then
        d( i, j ) = min2
      Else
        d( i, j ) = min3
      End If
    Next
  Next
  levenshtein = d( Len( a ), Len( b ) )
End Function
Hope this helps..
- Ralf
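For the trigram part of the original question, a similar module function could be sketched along the same lines. This is only a rough sketch, not something from the posts above: the name trigramSimilarity, the lower-casing, the space padding, and the choice of a Jaccard score (shared distinct trigrams divided by all distinct trigrams) are assumptions made for illustration.

```vbscript
' Hypothetical helper (not from the original post): trigram similarity
' as a Jaccard score between 0 (nothing shared) and 1 (same trigram set).
Function trigramSimilarity( a, b )
  Dim pa, pb, i, g, countA, countB, shared
  ' Assumption: lower-case and pad with spaces so word edges form trigrams too
  pa = "  " & LCase( a ) & " "
  pb = "  " & LCase( b ) & " "
  countA = 0: countB = 0: shared = 0
  ' Count distinct trigrams of a, and how many of them also occur in b
  For i = 1 To Len( pa ) - 2
    g = Mid( pa, i, 3 )
    ' InStr returns the first position, so this skips repeated trigrams
    If InStr( pa, g ) = i Then
      countA = countA + 1
      If InStr( pb, g ) > 0 Then shared = shared + 1
    End If
  Next
  ' Count distinct trigrams of b
  For i = 1 To Len( pb ) - 2
    g = Mid( pb, i, 3 )
    If InStr( pb, g ) = i Then countB = countB + 1
  Next
  ' The padding guarantees at least one trigram per string, so no division by zero
  trigramSimilarity = shared / ( countA + countB - shared )
End Function
```

It would be called like the Levenshtein function, e.g. trigramSimilarity(F1,F2) as similarity in the LOAD. For "Qlik" and "Klik" the padded strings share 2 of 8 distinct trigrams, so the score is 0.25. Unlike the edit distance, a higher value means a closer match.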
Ralf,
suggest that you develop those functions and share them with the rest of us
Kidding aside - those would make excellent improvement requests. I just don't know how high they would be on the priority list, since the need is quite exotic...
take care,
Oleg
Oleg,
thanks for your suggestion, but VBScript isn't the right place for it. We're playing around with some C++ implementations, but these still need a VBScript call and a separate DLL...
Would love a QV script improvement!
Ralf
I'd look at calling a VBScript function from QlikView. And once you're in VBScript-land, you can call an external library.
Maybe there is something new in QV 9?
Doesn't appear to be.
Ralf,
why don't you request it as an "idea" and then convince other people to "second" your movement?
Oleg
Oleg,
I don't really know how or where to do this here. I'm a bit new... 8-)
Ralf
I hope this comes in v10 🙂
I want this too.
Is it not possible to take this VBA script and then use it in an expression?
http://en.wikibooks.org/wiki/Algorithm_Implementation/Strings/Levenshtein_distance
I'm using something similar with regex.