Multiple regex exclude filemask on tFileList

Hello community,

 

-- Version: TOS BD 6.5.1

 

I have an issue with a regex applied to the Talend component tFileList.

To be honest, I think it is a problem that needs to be fixed by the designers (if it has not already been done in 6.5+).

The issue appears when I use a regular expression to exclude some files in my tFileList execution.

 

For my example, I need to exclude 2 different types of files:

context.fileMask_1 = "^\\d+_{1}\\d+_{1}[A-Z]{3,4}(\\.pdf){1}$";
context.fileMask_2 = "^([A-Z]|\\d)+_{1}([A-Z]|\\d)+(\\.pdf){1}$";

In my tFileList I check the 'Use Exclude Filemask' box and insert the following code:

context.fileMask_2+","+context.fileMask_1

Got it? You can guess there will be an issue here, since Talend decided that several exclude expressions can be aggregated using a comma (which is a really good thing)!

--

--

--

Still don't get it? This error message will help you understand the issue:

Exception in component tFileList_4 (testtet)
java.util.regex.PatternSyntaxException: Unclosed counted closure near index 22
^\d+_{1}\d+_{1}[A-Z]{3
	at java.util.regex.Pattern.error(Unknown Source)
	at java.util.regex.Pattern.closure(Unknown Source)
	at java.util.regex.Pattern.sequence(Unknown Source)
	at java.util.regex.Pattern.expr(Unknown Source)
	at java.util.regex.Pattern.compile(Unknown Source)
	at java.util.regex.Pattern.<init>(Unknown Source)
	at java.util.regex.Pattern.compile(Unknown Source)
	at p_compta.testtet_0_1.testtet.tFileList_4Process(testtet.java:2037)
	at p_compta.testtet_0_1.testtet.tFileList_5Process(testtet.java:3361)
	at p_compta.testtet_0_1.testtet.tLoop_2Process(testtet.java:3639)
	at p_compta.testtet_0_1.testtet.runJobInTOS(testtet.java:3917)
	at p_compta.testtet_0_1.testtet.main(testtet.java:3752)

This error log shows that Talend splits my regex at the comma inside the {3,4} quantifier: "^\\d+_{1}\\d+_{1}[A-Z]{3,4}(\\.pdf){1}$"
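
The same failure can be reproduced outside Talend. The sketch below assumes that the component splits the exclude-mask string on every comma and compiles each piece separately; this is inferred from the stack trace, not taken from the Talend source, and the class and method names are hypothetical.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

public class ExcludeMaskSplitDemo {

    // Same values as context.fileMask_1 and context.fileMask_2 above.
    public static final String FILE_MASK_1 = "^\\d+_{1}\\d+_{1}[A-Z]{3,4}(\\.pdf){1}$";
    public static final String FILE_MASK_2 = "^([A-Z]|\\d)+_{1}([A-Z]|\\d)+(\\.pdf){1}$";

    // Assumption: mimics a component that treats every comma as a mask separator.
    // Returns the first piece that fails to compile, or null if all pieces compile.
    public static String firstBrokenPiece(String excludeMasks) {
        for (String piece : excludeMasks.split(",")) {
            try {
                Pattern.compile(piece);
            } catch (PatternSyntaxException e) {
                return piece;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String combined = FILE_MASK_2 + "," + FILE_MASK_1;
        System.out.println("Broken piece: " + firstBrokenPiece(combined));
    }
}
```

Under that assumption, the piece that fails is the first half of fileMask_1, cut off right after "{3", which is the fragment shown in the exception message.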

 

So if anyone has faced this problem and worked around it successfully, I would be grateful.

 

PS: Trying to escape a comma is not a good idea in regular expressions.

 

Best regards,

Pierre


Accepted Solutions
Anonymous
Not applicable
Author

Hi Pierre
I ran a test, and I think 'Use Exclude Filemask' does not support regular expressions, whereas we can use a comma in a regular expression set as the filemask in the Basic settings of tFileList. I have opened a Jira issue for our R&D team to investigate it.
https://jira.talendforge.org/browse/TDI-43561
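
In the meantime, one possible workaround (a sketch only, not confirmed by Talend) is to rewrite the mask so that it contains no comma at all: `[A-Z]{3,4}` matches the same strings as `[A-Z]{3}[A-Z]?`. The class name and the sample file names below are made up for illustration.

```java
import java.util.regex.Pattern;

public class CommaFreeMaskDemo {

    // Original mask from the question; it contains a comma inside {3,4}.
    public static final String ORIGINAL = "^\\d+_{1}\\d+_{1}[A-Z]{3,4}(\\.pdf){1}$";
    // Comma-free rewrite: three letters followed by an optional fourth.
    public static final String COMMA_FREE = "^\\d+_{1}\\d+_{1}[A-Z]{3}[A-Z]?(\\.pdf){1}$";

    // True when both masks give the same verdict for the given file name.
    public static boolean sameResult(String fileName) {
        return Pattern.matches(ORIGINAL, fileName) == Pattern.matches(COMMA_FREE, fileName);
    }

    public static void main(String[] args) {
        // Hypothetical file names with 3, 4, 2 and 5 trailing letters.
        String[] samples = {"12_34_ABC.pdf", "12_34_ABCD.pdf", "12_34_AB.pdf", "12_34_ABCDE.pdf"};
        for (String s : samples) {
            System.out.println(s + " -> matches: " + Pattern.matches(COMMA_FREE, s)
                    + ", same as original: " + sameResult(s));
        }
    }
}
```

Whether the comma-separated exclude list then behaves as expected in tFileList has not been verified here.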

Regards
Shong


2 Replies
Anonymous
Not applicable
Author

Hello S.Hong

 

Thank you for your attention.

I am happy to help improve the TOS solution.

 

Regards,

Pierre