Hi all,
I have been pulling my hair out with this one and could really do with some help!
I have a table of medical terms (~10,000) and I want to map these to a more refined dictionary set of terms (~2000).
(e.g. the doctors write "inflammation", "inflammatory", "inflammatory lesion", "necrosis and inflammation" when they all just mean "inflammation")
I also have a pretty comprehensive dictionary that takes particular medical strings and maps them to specific dictionary terms (e.g. "inflammation and necrosis" would map to both "inflammation" and "necrosis").
When I run a direct match of the original strings to the dictionary strings I get around a 50% hit rate. But many of the missing hits are due to odd punctuation or word order (e.g. my dictionary might miss "inflammation/necrosis,")
So I am trying to build a load script that splits each original string into its words, does the same for each dictionary term, and then, for every pairing of an original term with a dictionary term, tries to match the words. The end result would be to use the dictionary term that has the largest number of word matches in the original string.
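To make the idea concrete, here is a minimal sketch of that word-overlap matching. It is written in Python rather than as a Qlik load script, it is not the solution from the attached .qvw, and the function names (`tokenize`, `best_matches`) are made up for illustration. It assumes that words are runs of letters or digits, that case does not matter, and that ties should all be returned.

```python
import re


def tokenize(text):
    """Lowercase the text and return its distinct words.

    Anything that is not a letter or digit (spaces, "/", ",") acts as a separator.
    """
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def best_matches(original, dictionary_terms):
    """Return the dictionary term(s) sharing the most words with `original`.

    Ties are all returned, sorted alphabetically; an empty list means that
    no dictionary term shares any word with the original string.
    """
    original_words = tokenize(original)
    # Score each dictionary term by the number of words it has in common.
    scores = {
        term: len(original_words & tokenize(term))
        for term in dictionary_terms
    }
    best = max(scores.values(), default=0)
    if best == 0:
        return []
    return sorted(term for term, score in scores.items() if score == best)


terms = ["inflammation and necrosis", "inflammation", "necrosis"]
print(best_matches("inflammation/necrosis,", terms))
```

Because this only compares whole words, punctuation and word order stop mattering, but "inflammatory" would still not match "inflammation"; that would need stemming or a synonym list on top.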
I can create two separate tables with the string subfields all ready for the match, but I can't get my head round how to do the matching.
Could you spend a moment and see if you could help me? I would be really grateful.
I have attached the .qvw
Thanks
Mark
I mused on this for a while and eventually came up with a solution:
See attached .qvw