rbecher
Partner - Master III

string matching with fuzzy, trigram (n-gram), levenshtein, etc.

Hi,

I'm looking for a way to do approximate string matching (fuzzy search, trigram/n-gram, Levenshtein, etc.) in QV script.

Any suggestions?

Ralf

Data & AI Engineer at Orionbelt.ai - a GenAI Semantic Layer Venture, Inventor of Astrato Engine
1 Solution

Accepted Solutions
rbecher
Partner - Master III
Author

Hi Karen,

I found a workable VBScript implementation as a function. It can be called per record during LOAD, so both strings to compare must be in the same row. That means you need to join the source data first:

LOAD Script:

Levenshtein:
LOAD F1, F2, levenshtein(F1,F2) as distance;
LOAD * INLINE [
    F1, F2
    Qlik, Qlik ltd
    Qlik ltd, Qlik Limited
    Qlik Limited, QlikTech
    Qlik, Klik
];

Module:

' Source:
' http://en.wikibooks.org/wiki/Algorithm_Implementation/Strings/Levenshtein_distance#VBScript
Function levenshtein( a, b )
    Dim i, j, cost, d, min1, min2, min3

    ' Avoid calculations where there are empty words
    If Len( a ) = 0 Then levenshtein = Len( b ): Exit Function
    If Len( b ) = 0 Then levenshtein = Len( a ): Exit Function

    ' Array initialization: d( i, 0 ) = i and d( 0, j ) = j
    ReDim d( Len( a ), Len( b ) )
    For i = 0 To Len( a ): d( i, 0 ) = i: Next
    For j = 0 To Len( b ): d( 0, j ) = j: Next

    ' Actual calculation
    For i = 1 To Len( a )
        For j = 1 To Len( b )
            ' Substitution is free when the characters match (case-sensitive)
            If Mid(a, i, 1) = Mid(b, j, 1) Then cost = 0 Else cost = 1

            ' Since min() function is not a part of VBScript, we'll "emulate" it below
            min1 = ( d( i - 1, j ) + 1 )        ' deletion
            min2 = ( d( i, j - 1 ) + 1 )        ' insertion
            min3 = ( d( i - 1, j - 1 ) + cost ) ' substitution

            If min1 <= min2 And min1 <= min3 Then
                d( i, j ) = min1
            ElseIf min2 <= min1 And min2 <= min3 Then
                d( i, j ) = min2
            Else
                d( i, j ) = min3
            End If
        Next
    Next

    levenshtein = d( Len( a ), Len( b ) )
End Function
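
To sanity-check the distances the module returns, the same recurrence can be sketched outside QlikView. The Python below is only a reference sketch, not part of the QV solution. It assumes the comparison is case-sensitive (as with the default `=` comparison in VBScript) and that the inline values reach the function without the space that follows each comma in the INLINE table.

```python
def levenshtein(a: str, b: str) -> int:
    """Case-sensitive edit distance using the same table as the VBScript module."""
    # Empty-string shortcuts, mirroring the early exits in the module
    if not a:
        return len(b)
    if not b:
        return len(a)

    # d[i][j] = distance between the first i chars of a and the first j chars of b
    d = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) + 1):
        d[i][0] = i
    for j in range(len(b) + 1):
        d[0][j] = j

    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution
            )
    return d[len(a)][len(b)]


# Same pairs as the INLINE table in the load script (assumed trimmed)
pairs = [
    ("Qlik", "Qlik ltd"),
    ("Qlik ltd", "Qlik Limited"),
    ("Qlik Limited", "QlikTech"),
    ("Qlik", "Klik"),
]

for f1, f2 in pairs:
    print(f"{f1} | {f2} | {levenshtein(f1, f2)}")
```

Under those assumptions the four sample rows give distances 4, 5, 8 and 1. If the values in QlikView keep a leading space, the numbers in the distance field will differ.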

Hope this helps..

- Ralf


26 Replies
Oleg_Troyansky
Partner Ambassador/MVP

Ralf,

I suggest that you develop those functions and share them with the rest of us ;-)

Kidding aside, those would make excellent improvement requests. I just don't know how high they would be on the priority list, since the need is quite exotic...

take care,

Oleg

Ask me about Qlik Sense Expert Class!
rbecher
Partner - Master III
Author

Oleg,

Thanks for your suggestion, but VBScript isn't the right place for it. We're playing around with some C++ implementations, but those still need a VBScript call and a separate DLL...

Would love a QV script improvement!

Ralf

Not applicable

I'd look at calling a VBScript function from QlikView. Once you're in VBScript-land, you can call an external library.

rbecher
Partner - Master III
Author

Maybe there is something new in QV 9?

Not applicable

Doesn't appear to be.

Oleg_Troyansky
Partner Ambassador/MVP

Ralf,

Why don't you request it as an "idea" and then convince other people to "second" your motion?

Oleg

rbecher
Partner - Master III
Author

Oleg,

I don't really know how or where to do this here. I'm a bit new.. 8-)

Ralf

amien
Specialist

I hope this makes it into v10 🙂

I want this too.

amien
Specialist

Isn't it possible to take this VBA script and then use it in an expression?

http://en.wikibooks.org/wiki/Algorithm_Implementation/Strings/Levenshtein_distance

I'm using something similar with regex.