
Anonymous
Not applicable
2012-07-27
12:18 PM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Regex Match in tMap Expression Builder
Hi,
I am new to Talend and want to say so far I am very impressed. I deal with tons of data that is very inconsistent and am trying to find a quick example of using a regular expression in the Expression builder within a tMap. After hunting around I found this gem, and though it would be useful to others:
Given you have the equivelent of an Enum data type you want to use.. and people keep giving you bad data like:
row1.Classification data:
This code in the expression builder:
yeilds clean data:
I am new to Talend and want to say so far I am very impressed. I deal with tons of data that is very inconsistent and am trying to find a quick example of using a regular expression in the Expression builder within a tMap. After hunting around I found this gem, and though it would be useful to others:
Given you have the equivelent of an Enum data type you want to use.. and people keep giving you bad data like:
row1.Classification data:
brown pear
rotten apples
salt rocks
red apples
green apple
pear pectin
This code in the expression builder:
row1.Classification.toLowerCase().matches(".*apple.*")?"Apple":row1.Classification.toLowerCase().matches(".*pear.*")?"Pear":row1.Classification.toLowerCase().matches(".*salt.*")?"Salt":"Unknown"
yeilds clean data:
Pear
Apple
Salt
Apple
Apple
Pear
569 Views
2 Replies

Specialist
2012-07-29
08:48 PM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
You can also make your regex case-insensitive i.e. instead of .toLowerCase().matches(".*pear.*") use .matches("(?i).*pear.*")
569 Views

Anonymous
Not applicable
2012-11-28
03:49 AM
Author
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
I'm trying to use the tMap expression builder to apply a regex match on VAT registration numbers based on the country. I want to do this using a lookup variable rather than writing an explicit expression in the tMap for each of the country numbering format variations.
I can get the expression below to work for UK VAT numbers...
Relational.ISNULL(row3.Vat_Reg_Number)==false&&row3.Vat_Reg_Number.matches("^()*((\\d{8})|(\\d{11})|(GD\\d{2})|(HA\\d{2}))$")==true? "VAT number OK" : "VAT number is incorrectly formatted"
I?m looking to do something like this?
Relational.ISNULL(row3.Vat_Reg_Number)==false&&row3.Vat_Reg_Number.matches((row4.vat_reg_regex))==true? "VAT number OK" : "VAT number is incorrectly formatted"
?where ( row4.vat_reg_regex ) is a lookup from a database table keyed on the country. So far I have not been able to get this to work.
Does anyone know if this approach is workable or not?
I can get the expression below to work for UK VAT numbers...
Relational.ISNULL(row3.Vat_Reg_Number)==false&&row3.Vat_Reg_Number.matches("^()*((\\d{8})|(\\d{11})|(GD\\d{2})|(HA\\d{2}))$")==true? "VAT number OK" : "VAT number is incorrectly formatted"
I?m looking to do something like this?
Relational.ISNULL(row3.Vat_Reg_Number)==false&&row3.Vat_Reg_Number.matches((row4.vat_reg_regex))==true? "VAT number OK" : "VAT number is incorrectly formatted"
?where ( row4.vat_reg_regex ) is a lookup from a database table keyed on the country. So far I have not been able to get this to work.
Does anyone know if this approach is workable or not?
569 Views
