One table with transaction data [TRANSACTIONS]. In this table there is a (free format) field that contains the description of the transaction.
Have created a mapping table with keywords and categories [MAPPINGS].
Now I like to use the mapping table in such a way that it searches for the keywords in the description field of the TRANSACTION table. It can then take the categories of the first matching/found keyword.
Are there possibilities in QlikView to do such a fuzzy join or fuzzy mapping?
Well, it's definitely not possible with mapping since that is a very simple 1-input-1-output method. As far as I know there is no built-in way to do a fuzzy join either, although you might be able to code this. For example, you might be able to search for a keyword using the index() or substringcount() functions. However, I imagine this would only work if you don't have too many unique values in the Keyword field. Essentially, the solution I had in mind would look like this:
Transactions_new: LOAD DATE, VALUE, DESCRIPTION, CATEGORY RESIDENT TRANSACTIONS WHERE substringcount(DESCRIPTION,KEYWORD)>0;
DROP TABLE TRANSACTIONS;
This would create a Cartesian join (a join of every possible combination of both tables) because the fields Description and Keyword start off as disconnected. So, for example, if Transactions has 1M rows and Mappings has 10K rows, the resulting table from the outer join would contain 10B rows which I doubt your server (or your patience) could handle. So the short answer to your question is: maybe, if you don't have too many keywords.
Edit: By the way, if DESCRIPTION is not 100% unique, it would be cheaper to create an intermediate table that contains just DISTINCT DESCRIPTION and perform the outer join from MAPPINGS into that intermediate table. Afterwards, you can either leave the table as is or left join it back into TRANSACTIONS with CATEGORY now included.
Amien wrote: that's for wildcards .. not fuzzy matching i was really looking for a load solution<div></div>
Wigchert asked for "searches for the keywords", which the wildcard mapping can solve. It IS a load solution. The "no hits" check that Vlad is talking about can easily be added to the load if you want that additional piece.
But you appear to mean something different by "fuzzy matching". What do YOU mean by it? Something like if one table has "Amien" and another has "amian", it'll match them anyway, because they're close? So you'd need some sort of "closeness function", and then map to the row with the highest value for the "closeness function"? Something like that? That would be challenging.
Amien wrote: i had the following issues with the solution in the cookbook, perhaps someone allready found a solution for this? : * issues when i have 2 hits * i wanted to put the no hits in a seperate category, like 'others'
In the cookbook example, the last entry in the Wildmatch table is: *, Other * matches everything. This is intended to catch the "no-hits" and assign them to "Other".
What's the issue re 2 hits? Do you want to generate another row for the second (or more) hits?