Skip to main content
Announcements
A fresh, new look for the Data Integration & Quality forums and navigation! Read more about what's changed.
cancel
Showing results for 
Search instead for 
Did you mean: 
Anonymous
Not applicable

Regex Match in tMap Expression Builder

Hi,
I am new to Talend and want to say so far I am very impressed. I deal with tons of data that is very inconsistent and am trying to find a quick example of using a regular expression in the Expression builder within a tMap. After hunting around I found this gem, and though it would be useful to others:
Given you have the equivelent of an Enum data type you want to use.. and people keep giving you bad data like:
row1.Classification data:
brown pear
rotten apples
salt rocks
red apples
green apple
pear pectin

This code in the expression builder:
row1.Classification.toLowerCase().matches(".*apple.*")?"Apple":row1.Classification.toLowerCase().matches(".*pear.*")?"Pear":row1.Classification.toLowerCase().matches(".*salt.*")?"Salt":"Unknown"

yeilds clean data:
Pear
Apple
Salt
Apple
Apple
Pear
Labels (2)
2 Replies
alevy
Specialist

You can also make your regex case-insensitive i.e. instead of .toLowerCase().matches(".*pear.*") use .matches("(?i).*pear.*")
Anonymous
Not applicable
Author

I'm trying to use the tMap expression builder to apply a regex match on VAT registration numbers based on the country. I want to do this using a lookup variable rather than writing an explicit expression in the tMap for each of the country numbering format variations.
I can get the expression below to work for UK VAT numbers...
Relational.ISNULL(row3.Vat_Reg_Number)==false&&row3.Vat_Reg_Number.matches("^()*((\\d{8})|(\\d{11})|(GD\\d{2})|(HA\\d{2}))$")==true? "VAT number OK" : "VAT number is incorrectly formatted"
I?m looking to do something like this?
Relational.ISNULL(row3.Vat_Reg_Number)==false&&row3.Vat_Reg_Number.matches((row4.vat_reg_regex))==true? "VAT number OK" : "VAT number is incorrectly formatted"
?where ( row4.vat_reg_regex ) is a lookup from a database table keyed on the country. So far I have not been able to get this to work.
Does anyone know if this approach is workable or not?