Skip to main content
Announcements
See what Drew Clarke has to say about the Qlik Talend Cloud launch! READ THE BLOG
cancel
Showing results for 
Search instead for 
Did you mean: 
fredrik_olsson
Contributor III
Contributor III

Match 2 comma separated lists in a resource efficient way

Hi,

I'm trying to find a resource efficient solution to see if any value in the first comma separated list matches any value in the second comma separated list (case insensitive).

list1 = 'TOMMY,MARTIN,CHARLIE,VICTORIA';
list2 = 'FRED,VICTOR,VICTORIA,ELLEN,JOE';

if (wildMatch(list1, list2), 'matches', 'no matches')

Any help is appreciated. 

Labels (1)
7 Replies
Or
MVP
MVP

Probably the easiest way to do this would be to subfield() the first list (in script) and then wildmatch the subfielded list with the second list. Whether or not this would be resource efficient depends on both the number of rows this needs to be applied to and the typical length of the lists.

fredrik_olsson
Contributor III
Contributor III
Author

It's applied to approx 500.000 rows in the load script. 
Let's say the data after some preparation looks something like this:

OrderID, List1, List2
123123, 'TOMMY,MARTIN,CHARLIE,VICTORIA', 'FRED,VICTOR,VICTORIA,ELLEN,JOE'


Is there a faster / more efficient way of doing it?

 

marcus_sommer

Did you try the suggestion from @Or ? Regarding to your provided information such subfield-load would create 2 M of records and a wildmatch against it should also not take very long. If it's too long what would be your expectation?

- Marcus 

fredrik_olsson
Contributor III
Contributor III
Author

Thanks @marcus_sommer & @Or for trying to help out.

I tried with subfield but I can't get a match for same values in different locations in the different lists:

=if (WILDMATCH('A1,B2', SubField('C3,A1', ',')), 'Match', 'No match')

 

I can't get A1 to match in the example above. Any help is appreciated. 

marcus_sommer

You may try it in this way:

load *, rowno() as RowNo, wildmatch(List1, '*' & ListPart & '*') as Flag;
load *, subfield(List2,',') as ListPart, recno() as RecNo;
load * inline [
OrderID, List1, List2
123123, 'TOMMY,MARTIN,CHARLIE,VICTORIA', 'FRED,VICTOR,VICTORIA,ELLEN,JOE'
];

- Marcus

MarcoWedel

another solution might be

If(Evaluate('WildMatch('','&List1&','&Chr(39)&',''*,'&Replace(List2,',',',*'',''*,')&',*'')'),'matches', 'no matches') as Match

though not quite sure about the performance of the string operations and evaluation in comparison to the subfield approach.

MarcoWedel

slightly shorter

If(Evaluate('WildMatch('','&List1&','',''*,'&Replace(List2,',',',*'',''*,')&',*'')'),'matches', 'no matches') as Match