agentgill
Contributor

Regex to match repeated letters in string using java pattern

Talend version: 4.2.3 / 5.2.0M3
OS: Windows / Mac
I am trying to filter out phone numbers that consist of a single repeated character, e.g.
0000, 111111, 99999999, 8888, 22222, 000000000 - basically anything that is one character repeated.
I am using the following regex in the tMap:
row1.Phone.matches("()\1{3}")?"":row1.Phone
or this:
row1.Phone.matches("()\\1{3}")?"":row1.Phone (with the backslash escaped for Java)
When testing the expression I get this
Exception in thread "Main" java.lang.error : Unresolved compilation errors
This works outside Talend - see the second screenshot.
Any ideas?
4 Replies
sizhaoliu
Contributor III

Actually, a job is generated behind the scenes to evaluate the test record.
Because the test value is not quoted, the generated job contains an expression like 123.matches(...), which is not valid Java syntax.
You can replace every occurrence of row1.Phone with String.valueOf(row1.Phone), or even row1.Phone+"", to avoid the compilation problem.
This compilation problem does not appear when you run the job, so from a user's point of view it is a bit odd that it shows up in the test area.
Another point: the method string.matches(regex) only filters records that match in full, such as "1111" or "2222", but not "12222" or "22221". Pattern.matches(regex, string) gives the same results.
So I propose the following expression, which filters any input that contains a repetition anywhere inside it:
java.util.regex.Pattern.compile("()\\1{3}").matcher(String.valueOf(row1.Phone)).find() ? "" : String.valueOf(row1.Phone)
This also works in the test area, since I added String.valueOf.
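To make the difference between matches() and find() concrete, here is a minimal sketch. It assumes the capturing group was meant to be (.), i.e. any single character; the () in the expressions above is reproduced as posted. The class name RepeatDemo and its method names are hypothetical and not part of Talend.

```java
import java.util.regex.Pattern;

// Hypothetical demo class. The pattern (.)\1{3} is an assumption:
// one captured character followed by three more copies of it.
final class RepeatDemo {
    private static final Pattern FOUR_IN_A_ROW = Pattern.compile("(.)\\1{3}");

    // matches(): the whole string must be exactly four identical characters.
    static boolean wholeStringMatches(Object phone) {
        return FOUR_IN_A_ROW.matcher(String.valueOf(phone)).matches();
    }

    // find(): true if a run of four identical characters appears anywhere.
    static boolean containsRun(Object phone) {
        return FOUR_IN_A_ROW.matcher(String.valueOf(phone)).find();
    }

    public static void main(String[] args) {
        // 22221 is passed as an int to show that String.valueOf handles non-String values.
        for (Object p : new Object[] {"1111", "12222", 22221, "0800 2222 1234"}) {
            System.out.println(p + " matches=" + wholeStringMatches(p)
                    + " find=" + containsRun(p));
        }
    }
}
```

Under these assumptions, find() also flags a number such as "0800 2222 1234", because it contains a run of four identical digits even though the number as a whole is not a single repeated character.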
agentgill
Contributor
Author

Thanks, no more compilation errors. I see now, very powerful. It's possible to call external Java classes!
Very close now, but I'm not sure all the use cases are covered.
A phone number like this results in null:
0800 2222 1234
It's beginning to look like RegEx is not the best answer. What do you think / recommend?
Thanks
agentgill
Contributor
Author

I'm thinking of a custom routine: get the first character and the string length, then compare with the original string.
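One way the idea above might look in code, as a rough sketch only: take the first character, repeat it to the length of the string, and compare the result with the original. The class name SameCharCheck and the method isSingleRepeatedChar are hypothetical, and ignoring spaces is an assumption based on the "0800 2222 1234" example.

```java
// Hypothetical sketch; names are illustrative and not from the thread.
final class SameCharCheck {
    // Returns true when, ignoring spaces, every character equals the first one.
    static boolean isSingleRepeatedChar(String input) {
        if (input == null) {
            return false;
        }
        String compact = input.replace(" ", "");
        if (compact.isEmpty()) {
            return false;
        }
        // Build a string of the first character repeated to the same length...
        char first = compact.charAt(0);
        StringBuilder repeated = new StringBuilder(compact.length());
        for (int i = 0; i < compact.length(); i++) {
            repeated.append(first);
        }
        // ...and compare it with the original (spaces removed).
        return repeated.toString().equals(compact);
    }
}
```

Note that in this sketch a one-character input such as "5" counts as repeated; whether that is the desired behaviour depends on the data.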
agentgill
Contributor
Author

The final solution is this: I added a custom routine, which I call in the tMap.
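package routines;

import java.util.regex.Pattern;

public class FT_CompareString {

    // Returns the input unchanged unless, ignoring spaces, it is empty or
    // consists of one repeated character (e.g. "0000"); then returns null.
    public static String CheckString(String input) {
        if (input == null) {
            return null;
        }
        String compact = input.replace(" ", "");
        // Check the content with isEmpty(); input != "" only compared references.
        if (compact.isEmpty()) {
            return null;
        }
        // Quote the first character so regex metacharacters such as '+' are taken literally.
        String first = Pattern.quote(compact.substring(0, 1));
        // split() drops trailing empty strings, so an all-same string yields length 0.
        return compact.split(first).length > 0 ? input : null;
    }
}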
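For comparison, the same whole-string check can probably also be written as a regex, which may answer the earlier question of whether regex is workable here. This is a minimal sketch, not the routine above: the pattern (.)\1* and the names AnchoredRepeatCheck and nullIfAllSame are assumptions, as is the choice to ignore spaces.

```java
import java.util.regex.Pattern;

// Hypothetical regex-based alternative; names and pattern are not from the thread.
final class AnchoredRepeatCheck {
    // (.) captures the first character; \1* allows only further copies of it.
    // Used with matches(), so the pattern must cover the whole string.
    private static final Pattern ALL_SAME = Pattern.compile("(.)\\1*");

    // Returns null for null, blank, or single-repeated-character input;
    // otherwise returns the input unchanged.
    static String nullIfAllSame(String phone) {
        if (phone == null) {
            return null;
        }
        String compact = phone.replace(" ", "");
        if (compact.isEmpty() || ALL_SAME.matcher(compact).matches()) {
            return null;
        }
        return phone;
    }
}
```

Because matches() requires the entire string to match, a number such as "0800 2222 1234" is kept even though it contains a run of identical digits.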