Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
In the middle of a flow, I need to find a string in another table, the two tables don't have any relations.
here is a sample:
Table CustomerInfo 100 000 rows
fields:
cust_id
cust_info
cust_info includes customer information like address, phone number, city, province separated by delimiters
I have a variable called prov extracted from above with the value of the province.
Table province 200 rows
id
name
code
country
If the prov's value exists in the province table(province name) I want to get the province code and country.
I tried to use this solution
https://community.talend.com/t5/Design-and-Development/Finding-keywords-in-a-tabel/m-p/151845#M92805
I used this code in a tMap base on the solution above, but the result is a cartesian as there are not any joins between tables.
code:
row21.name.contains(row6.prov)?row21.code:""
row21 is lookup from Table province and row 6 is from the main flow which I process Table CustomerInfo
I think I should use something like tJavaFlex and load the Table province in the starting part and in the middle for every row of CustomerInfo table is province_name exists or not. I am not sure and I don't know how.
Appreciate any help.
Hi,
If you are having exact match for province data with lookup table, then you can avoid Cartesian match. You need to add the row6.prov as matching key for lookup table data.
But if your lookup data is not in a standardized format, then you are in trouble. You can do mix and match by using contain clause to try to do matching. But I am not a fan of this as it may lead to data issues later. My suggestion would be always to standardize the entire address data as the first stage and use the clean data for matching in later stages.
Talend has already got lot of Address standardization components as shown below.
Its up to individual customer to select which service they would prefer to use. I have used Experian data before for one of the customers for a big UK customer and the output of standardized data was in excellent condition to use further processing.
If you are planning for simple contains clause, there might be a chance of wrong match which will lead to data privacy related issues in the era of GDPR (which may result in unwanted legal issues). So discuss about all these aspects with your architect team before deciding on the match clause on raw data.
Warm Regards,
Nikhil Thampi
Please appreciate our Talend community members by giving Kudos for sharing their time for your query. If your query is answered, please mark the topic as resolved 🙂
the row6.prov can be empty or has a province which is not in the look-up table.
So I used the join model: left join and match model: unique, but it does not return the data. I all so did trim for both sides.
But it does not return any data for exact matches.
That is why I didn't use the matching.
Hi,
Did you check whether the data is in same case for both main and lookup flows? It could be due to difference in case of the input data. Could you please do upper or lower case conversion and then try it? If you are still facing the problem, please share some 5 or 6 input records for both main and lookup flow for inspection.
Warm Regards,
Nikhil Thampi
Please appreciate our Talend community members by giving Kudos for sharing their time for your query. If your query is answered, please mark the topic as resolved 🙂