Skip to main content
Announcements
SYSTEM MAINTENANCE: Thurs., Sept. 19, 1 AM ET, Platform will be unavailable for approx. 60 minutes.
cancel
Showing results for 
Search instead for 
Did you mean: 
DM_J
Contributor II
Contributor II

Find a string in another table

In the middle of a flow, I need to find a string in another table, the two tables don't have any relations.

 

here is a sample:

Table CustomerInfo  100 000 rows

fields:

cust_id

cust_info

 

cust_info includes customer information like address, phone number, city, province separated by delimiters

I have a variable called prov extracted from above with the value of the province.

 

Table province  200 rows

id

name

code

country

 

If the prov's value exists in the province table(province name) I want to get the province code and country.

 

I tried to use this solution

https://community.talend.com/t5/Design-and-Development/Finding-keywords-in-a-tabel/m-p/151845#M92805

 

I used this code in a tMap base on the solution above, but the result is a cartesian as there are not any joins between tables.

code:

row21.name.contains(row6.prov)?row21.code:""

 

row21 is lookup from Table province and row 6 is from the main flow which I process Table CustomerInfo 

 

 

0683p000009M4kP.png

 

I think I should use something like tJavaFlex and load the Table province in the starting part and in the middle for every row of CustomerInfo table is province_name exists or not. I am not sure and I don't know how.

 

Appreciate any help.

Labels (3)
3 Replies
Anonymous
Not applicable

Hi,

 

    If you are having exact match for province data with lookup table, then you can avoid Cartesian match. You need to add the row6.prov as matching key for lookup table data.

 

    But if your lookup data is not in a standardized format, then you are in trouble. You can do mix and match by using contain clause to try to do matching. But I am not a fan of this as it may lead to data issues later. My suggestion would be always to standardize the entire address data as the first stage and use the clean data for matching in later stages.

 

    Talend has already got lot of Address standardization components as shown below.

0683p000009M4OQ.png

 

Its up to individual customer to select which service they would prefer to use. I have used Experian data before for one of the customers for a big UK customer and the output of standardized data was in excellent condition to use further processing.

 

If you are planning for simple contains clause, there might be a chance of wrong match which will lead to data privacy related issues in the era of GDPR (which may result in unwanted legal issues). So discuss about all these aspects with your architect team before deciding on the match clause on raw data.

 

Warm Regards,
Nikhil Thampi

Please appreciate our Talend community members by giving Kudos for sharing their time for your query. If your query is answered, please mark the topic as resolved 🙂

 

 

DM_J
Contributor II
Contributor II
Author

the row6.prov can be empty or has a province which is not in the look-up table.

 

So I used the join model: left join and match model: unique, but it does not return the data. I all so did trim for both sides.

But it does not return any data for exact matches.

 

That is why I didn't use the matching. 

 

 

Anonymous
Not applicable

Hi,

 

    Did you check whether the data is in same case for both main and lookup flows? It could be due to difference in case of the input data. Could you please do upper or lower case conversion and then try it? If you are still facing the problem, please share some 5 or 6 input records for both main and lookup flow for inspection.

 

Warm Regards,
Nikhil Thampi

Please appreciate our Talend community members by giving Kudos for sharing their time for your query. If your query is answered, please mark the topic as resolved 🙂