Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Helllo all,
I have a table with two columns ' Old address' and 'New address'. I need to compare these values and identify it they are 100 % mismatch or new address is subset of old address ?. In case there is addition of new character or deletion of existing character, then those needs to be tracked in last column.
Please suggest. Thanks in advance.
Try like:
Load
*,
If(New=Old, 'ExactMatch','SubSet') as Result,
If(New<>Old,If(len(Old)>Len(New),chr(39)& Replace(Old,New,'')&chr(39)&' removed','Added '&chr(39)&Replace(New,Old,'')&chr(39))) as Discrepancy
Change field name to yours.
This could be useful: Fuzzy Matching/Joining
Hello Tresesco,
Thanks a lot for your reply. I have tried implementing your solution . However for few scenario's its not working. May I know the role of Chr(39) ? Please find ticked records where the results are not as expected.
These values actually neither match nor become subset. If you have this third possibility too, you have to work a bit more on script.
May be like this?
Load
*,
If( Result='SubSet',
If(New<>Old,If(len(Old)>Len(New),chr(39)& Replace(Old,New,'')&chr(39)&' removed','Added '&chr(39)&Replace(New,Old,'')&chr(39)))
) as Discrepancy;
Load *,
If(New=Old, 'ExactMatch',If(Index(Old,New) or Index(New,Old),'SubSet', 'NotEvenSubset')) as Result
Inline [
Old, New
Abc, Abc
Abcc, Abc
Bcd, ABcd
XXX, XXYY
];
And yes, chr(39) is for inverted comma (')
You can use Tesesco's solution after cleaning all the non alphanumeric characters and then performing the match:
Load
*,
If(New = Old, 'ExactMatch', 'SubSet') as Result,
If(New <> Old,
If(len(Old) > Len(New),
chr(39) & Replace(Old,New,'') & chr(39) & ' removed',
'Added ' & chr(39) & Replace(New,Old,'') & chr(39))
) as Discrepancy;
Load
*,
KeepChar(Lower(OLD_VALUE), 'abcdefghijklmnopqrstuvwxyz0123456789') as Old,
KeepChar(Lower(NEW_VALUE), 'abcdefghijklmnopqrstuvwxyz0123456789') as New
FROM ...
Thanks again for quick work around.
Can we have more details for 'Notevensubset' records as they are also more of sub sets only with few character changes.
In above record '-' is removed in new address. so can we identify such character discrepancies ?
and sorry for asking you again, may I know importance of Char (39) ?
Chr(39) means single quote