Multiple regex exclude filemask on tFileList

Hello community,

 

-- Version : Tos BD 6.5.1

 

I have an issue with a regex applied to the Talend component tFileList.

To be honest, I think this is a problem that needs to be fixed by the designers (if it has not already been fixed in 6.5+).

The issue appears when I use a regular expression to exclude some files in my tFileList execution.

 

For my example, I need to exclude two different types of files:

context.fileMask_1 = "^\\d+_{1}\\d+_{1}[A-Z]{3,4}(\\.pdf){1}$";
context.fileMask_2 = "^([A-Z]|\\d)+_{1}([A-Z]|\\d)+(\\.pdf){1}$";

In my tFileList, I check the 'Use Exclude Filemask' box and enter the following expression:

context.fileMask_2+","+context.fileMask_1

Got it? You can guess there will be an issue here, since Talend decided that several exclude expressions can be combined using a comma (which is a really good thing in itself).

--

--

--

Still don't get it? This error message will help you understand the issue:

Exception in component tFileList_4 (testtet)
java.util.regex.PatternSyntaxException: Unclosed counted closure near index 22
^\d+_{1}\d+_{1}[A-Z]{3
	at java.util.regex.Pattern.error(Unknown Source)
	at java.util.regex.Pattern.closure(Unknown Source)
	at java.util.regex.Pattern.sequence(Unknown Source)
	at java.util.regex.Pattern.expr(Unknown Source)
	at java.util.regex.Pattern.compile(Unknown Source)
	at java.util.regex.Pattern.<init>(Unknown Source)
	at java.util.regex.Pattern.compile(Unknown Source)
	at p_compta.testtet_0_1.testtet.tFileList_4Process(testtet.java:2037)
	at p_compta.testtet_0_1.testtet.tFileList_5Process(testtet.java:3361)
	at p_compta.testtet_0_1.testtet.tLoop_2Process(testtet.java:3639)
	at p_compta.testtet_0_1.testtet.runJobInTOS(testtet.java:3917)
	at p_compta.testtet_0_1.testtet.main(testtet.java:3752)

This error log shows that Talend splits my regex at the comma inside the {3,4} quantifier, so only the part before that comma is compiled: "^\\d+_{1}\\d+_{1}[A-Z]{3,4}(\\.pdf){1}$"
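To make the failure easier to reproduce outside Talend, here is a minimal Java sketch. It assumes that the exclude mask is simply split on every comma before each piece is compiled; I have not checked the generated tFileList code, so `splitMasks` is a hypothetical stand-in, not Talend's actual implementation. The class and method names are mine.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

final class CommaSplitDemo {
    // Masks copied from the post (Java string literals, so backslashes are doubled).
    static final String FILE_MASK_1 = "^\\d+_{1}\\d+_{1}[A-Z]{3,4}(\\.pdf){1}$";
    static final String FILE_MASK_2 = "^([A-Z]|\\d)+_{1}([A-Z]|\\d)+(\\.pdf){1}$";

    // ASSUMPTION: the combined exclude mask is split on every comma.
    // This mimics the behaviour suggested by the stack trace, nothing more.
    static String[] splitMasks(String combined) {
        return combined.split(",");
    }

    // Returns the fragments that java.util.regex refuses to compile.
    static List<String> invalidFragments(String combined) {
        List<String> bad = new ArrayList<>();
        for (String part : splitMasks(combined)) {
            try {
                Pattern.compile(part);
            } catch (PatternSyntaxException e) {
                bad.add(part);
            }
        }
        return bad;
    }

    public static void main(String[] args) {
        String combined = FILE_MASK_2 + "," + FILE_MASK_1;
        for (String part : splitMasks(combined)) {
            System.out.println("fragment: " + part);
        }
        System.out.println("invalid:  " + invalidFragments(combined));
    }
}
```

Under that assumption, the comma in {3,4} is treated as a separator like any other, which leaves a fragment ending in an unclosed `{3` and would explain the PatternSyntaxException above.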

 

So if someone has faced this problem and worked around it successfully, I would be grateful for any advice.

 

PS: Trying to escape the comma is not a good idea in regular expressions.
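One possible way to sidestep the comma without escaping it is to rewrite the quantifier so that the mask contains no comma at all: [A-Z]{3,4} accepts the same strings as [A-Z]{3}[A-Z]?. The sketch below only checks that equivalence in plain Java on a few made-up file names. Whether tFileList 6.5.1 then accepts the rewritten mask as an exclude filemask is an assumption I have not tested.

```java
import java.util.regex.Pattern;

final class CommaFreeMask {
    // Original mask from the post.
    static final String ORIGINAL = "^\\d+_{1}\\d+_{1}[A-Z]{3,4}(\\.pdf){1}$";

    // Same mask with {3,4} rewritten as "three letters plus an optional fourth".
    static final String COMMA_FREE = "^\\d+_{1}\\d+_{1}[A-Z]{3}[A-Z]?(\\.pdf){1}$";

    // True when both masks give the same verdict for the given file name.
    static boolean sameVerdict(String fileName) {
        return Pattern.matches(ORIGINAL, fileName) == Pattern.matches(COMMA_FREE, fileName);
    }

    public static void main(String[] args) {
        // Hypothetical file names, chosen only to exercise the quantifier.
        String[] names = {
            "123_45_AB.pdf", "123_45_ABC.pdf", "123_45_ABCD.pdf", "123_45_ABCDE.pdf"
        };
        for (String name : names) {
            System.out.println(name
                + " matches=" + Pattern.matches(COMMA_FREE, name)
                + " sameAsOriginal=" + sameVerdict(name));
        }
    }
}
```

With a comma-free mask, the expression context.fileMask_2+","+context.fileMask_1 would contain exactly one comma, the separator itself.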

 

Best regards,

Pierre


Accepted Solutions
Anonymous
Not applicable
Author

Hi Pierre
I ran a test, and I think 'Use Exclude Filemask' does not support regular expressions, whereas a regular expression containing a comma does work as the filemask in the basic settings of tFileList. I have opened a Jira issue for our R&D team to investigate it.
https://jira.talendforge.org/browse/TDI-43561

Regards
Shong

2 Replies
Anonymous
Not applicable
Author

Hello S.Hong

 

Thank you for your attention,

I am happy to help improve the TOS solution.

 

Regards,

Pierre