Skip to main content
Announcements
See what Drew Clarke has to say about the Qlik Talend Cloud launch! READ THE BLOG
cancel
Showing results for 
Search instead for 
Did you mean: 
Lauri
Specialist
Specialist

Best method for partial text match

I'm looking for advice on the best way to match two lists, to make it run fastest.

I have two lists of people, containing first name, last name and date of birth. Many of the people are the same, but sometimes the names are spelled slightly differently. 

My current method does everything in the load script: I load the two lists into one table with different field names (like Vax.LastName and EHR.LastName). I then use an IF statement to compare the left 3 characters of the two LastName fields and the left 3 of the FirstName fields and the DOB fields. I flag the people that match. This snippet comes after I put the left 3 from Vax.LastName into a field called Vax.Last.)

If(Vax.DOB=EHR.DOB and Vax.Last=EHR.Last and Vax.First=EHR.First, 1)

The problem: The lists have only about 15,000 and 45,000 names and yet this takes almost 10 minutes! 

Is there a better way in the load script or with Set Analysis?

1 Solution

Accepted Solutions
Or
MVP
MVP

If the birth date is guaranteed to be identical, which seems to be the case, you could use that to join the two tables and then Load Resident on the resulting table. This should prevent you from dealing with a cartesian join of 675m rows and make the whole thing run much faster.

View solution in original post

3 Replies
Or
MVP
MVP

If the birth date is guaranteed to be identical, which seems to be the case, you could use that to join the two tables and then Load Resident on the resulting table. This should prevent you from dealing with a cartesian join of 675m rows and make the whole thing run much faster.

JuanGerardo
Partner - Specialist
Partner - Specialist

Hi @Lauri, this is a very small dataset for Qlik Sense, and I agree 10 minutes is too much time for that If() comparison. I suspect the time is spent on the other operations, probably in the way you load and manipulate your tables.

JG

Lauri
Specialist
Specialist
Author

Thank you, @Or . You got me back on track with a Join on birth date, and now it runs in 6 seconds!!