Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi all!
I am looking for a solution in order to be able to find similar or duplicate records in my tables. Does anyone know an expression that for example looks at a record, notes the given number of characters and looks for it in other records?
Any help would be appreciated!
Thx,
Br
T
Firstly you need to have list of names that you wish to compare too.
Then try this link
qlikviewcookbook.com/list-recipes/
and look for Mapping With Wildcards
Could you provide an example?
Hi,
an example:
this name has been entered twice:
Matik Zamad Albas |
Zamad Albas Matim |
Not only the order, but also the 'Malim' part is different, but I want an expression that recognizes them as similar names/records
Hi,
If requirement is just for 5-6 names than you can use this :
LOAD Name,
pick( WildMatch(Name,'*Zamad Albas Matim*','*Name2*','*Name3*'),'Matik Zamad Albas','P2','P3')
as Data
FROM
(ooxml, embedded labels, table is Sheet3);
Else for bigger data , I don't think so it is possible.
Let's wait for others opinion.
Thanks,
AS
Hi amit,
the problem is, that I need this on more than 5 thousand names, so it is not an easy task to solve...
Tibor,
If this is the case than sorry bro , not possible.
Thanks,
AS
Guys, does anyone know a solution that is just nearing to what I would like to achieve? There must be someone who has encountered a similar problem....
Are there any rules to the names reordering or changes?
For example, you've said number of characters - you can use len() to get the string length of your field value, so if the name was always the same length or had a threshold for change (1 to 2 characters for example) you can define that as your first 'rule'.
If one of the names (say the middle name) will always be consistently in the string somewhere, you can do a search for it using the index() function matching with the subfield() function or something similar.
If there aren't really any definitive rules, I think you're going to be out of luck. Your best bet is to look at fuzzy matching algorithms and see what an acceptable tolerance level for mistakes is.
Firstly you need to have list of names that you wish to compare too.
Then try this link
qlikviewcookbook.com/list-recipes/
and look for Mapping With Wildcards
Thanks guys, I think you are right, no perfect solution exists to this problem yet. I managed to fabricate an expression that is somewhat generating the results I wanted (with a combination of len() and purgechar()) but it is far from perfect. Thanks anyways the comments, I really apprieciate your advices!
Br
Tibor