Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi.
Got a new problem that can't resolve by myself.
Customer database contains 1. car info 2. fact table with car number in comment field.
Problem: need some text parsing to create link between 1 and 2.
Example:
cars:
registrationNo, group
ABC123, Transport
BC234, Expedition
facts:
date, amount, comment
2015-01-01, 100, Insurance expenses for ABC 123
2015-01-02, 200, Insurance expenses for BC234 (document no 123456)
Need to assign car group for every fact row.
Preferred result table:
date, amount, comment, group
2015-01-01, 100, Insurance expenses for ABC 123, Transport
2015-01-02, 200, Insurance expenses for BC234 (document no 123456), Expedition
There are additional problems that make task even worse:
1. registration numbers in comments are entered not uniformly - sometimes they contain spaces and sometimes not
2. registratios numbers in comments varry in length - 5 or 6 chars (if without spaces)
3. registration numbers in comments might be in any place of comment (usually end but sometimes mid)
I have gut feeling like i need some mix of lookup and match functions just can't figure out how to do it exactly.
Any ideas?
From what I can tell, first, you need to find a way to get the RegNo in a uniform format. It seems from the Car table that they prefer to be without spaces.
You could load both tables, then create a new table that links them by loading the comments field and the RegNo field, using a key that is an if statement with the wildcard match.
New Table:
comment,
registrationNo,
if(wildmatch([comment], '*[registrationNo]*')=1, [registrationNo}) as KeyField
Hello DataDigger,
Definitely it is not a straight forward approach, and you might need little bit of tweaking around with the comment field. Please find attached eg file which i tried might be of any help. Based on my assumption thinking that the Reg numbers in the comment field will be seen after the word 'For'. Taking this as a consideration i have done a workaround.
Let me know if that of any help.
Thanks,
V.
Sorry for uncarefully selected set of data for an example. I accidently picked 2 similar texts.
Assumption about 'for' is not applicable here as comment may wildly warry.
Also i can't see any attached file
thank you crystles for an idea, will try it tomorrow and will post the results.
I did attached the file. However, as said i have used For word to get the Registration Numbers so you can ignore that then. Just wondering did you try crystles if condition?? Did you get what you are looking for.
Thanks,
V.
crystles your statement like this :
New Table:
comment,
registrationNo,
if(wildmatch(trim([comment]), '*[registrationNo]*')=1, [registrationNo}) as KeyField
There are several threads here in the forum with similar requests. One of the threads that covers some different approaches is
You may need to create a lookup table with all combinations of registration numbers (i.e. with different spelling, with and without spaces).
Yes I thought about putting the trim on the comments section, but I did not think it would matter as much since I was doing a Wildmatch. Trimming the comments field wouldn't have any advantage I think, but it could help find the Code faster.
Thanks!