kdv
Contributor

Sort of Fuzzy Matching

Warning, newbie question!

I have two files whose data I am trying to merge on a specific field (inner join). File A has a reasonably clean reference field that is easily parsed and used. File B, on the other hand, is an amalgamation of data from a variety of different sources, so its reference field comes in all sorts of shapes and sizes. I still want to be able to match them. Here is a practical, fictitious example of a reference in the two files:

File A: "Joe Bloggs"

File B: Fund Transfer : JoeBloggsACME-883366133256 : JOE BLOGGS BLOGGS Debit Account: 12196895 Credit Account: 12856966

 

Here is another example (from the same two files as the above example) to help show how different it can be, even within the same files:

 

File A: 432046055941

File B: "REF 432046055941"

 

Clearly a plain inner join won't work. However, as you can see, there is enough common text between the two fields that I should be able to match them. The format of File B is just not consistent, so it is impossible to build a single string manipulation formula. I have dabbled with the tFuzzyMatch component, but I didn't get great results and I suspect it is too "high brow" for my problem.

 

Is there another component or setting anybody can suggest, or can someone point me in the right direction please?

 

Thanks

1 Reply
vapukov
Master II

If it is always (!!!) as described in the examples, you just need

 

StringHandling.INDEX("hello world!","hello") != -1

 

If reality is more complicated, it needs more thought.
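
One thing to watch with the examples from the question: File A has "Joe Bloggs" while File B has "JoeBloggs" and "JOE BLOGGS", so a plain substring test would probably miss them because of case and spacing. A minimal sketch of one way around that is to normalize both sides before the "contains" check. The class and method names below (RefMatcher, normalize, matches) are hypothetical and not part of Talend; the code is plain Java that could sit in a user routine. It assumes the references only use ASCII letters and digits.

```java
import java.util.Locale;

// Hypothetical helper (e.g. a user routine); names are illustrative, not a Talend API.
class RefMatcher {

    // Upper-case the text and drop everything except A-Z and 0-9, so that
    // "Joe Bloggs", "JoeBloggs" and "JOE BLOGGS" all normalize to "JOEBLOGGS".
    // Assumption: references contain only ASCII letters and digits.
    static String normalize(String s) {
        if (s == null) {
            return "";
        }
        return s.toUpperCase(Locale.ROOT).replaceAll("[^A-Z0-9]", "");
    }

    // True when the normalized File A reference occurs somewhere inside
    // the normalized File B text. An empty reference never matches.
    static boolean matches(String refA, String textB) {
        String a = normalize(refA);
        return !a.isEmpty() && normalize(textB).contains(a);
    }
}
```

Called as RefMatcher.matches(fileA.reference, fileB.reference) in a filter after a cross join, this accepts both example pairs from the question. Because separators are stripped, a short reference can also match by accident across what used to be two separate fields, so it is worth checking the results on real data.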