Anonymous
Not applicable

[resolved] tFilterRow Advanced Mode Regex match fails?

Hi All,
I have a simple job that iterates through a directory of files. I want to filter the files whose names match a list of regular expressions, and to report on the files that match and those that don't. (Those that match will continue on for further processing.)
I have:
tFileList -> tFileProperties -> tFilterRow -> tLogRow (filter)
                                    |
                                    +-------> tLogRow (reject)
In the configuration of the tFilterRow, I have specified "advanced mode" and used the statement:
input_row.basename.matches("^\\d+")
as my initial test is simply to identify files beginning with any sequence of digits.
Currently all rows route through the reject log.
I have tested the same regex statement with tExtractRegexFields and it worked.
Does tFilterRow NOT support regex in this way?
Thanks

Accepted Solutions
Anonymous
Not applicable
Author

Hi souha,
The tMemorizeRows component is from Talend Open Studio for Data Integration and not from the paid / enterprise ...
Which version of Talend are you using?
Another way could be to save the value of the respective column in a context variable and then check that value against the current row. After the check, save the current value to the context variable again so that it is available for the next row.
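A minimal plain-Java sketch of that save-and-compare idea, for illustration only. The class name `PreviousRowCheck` and its method are hypothetical. In a real job the `previous` field would presumably be replaced by a context variable updated from a tJavaRow, and the "08"/"05" prefixes are assumed from the sample file posted in this thread.

```java
import java.util.Arrays;
import java.util.List;

class PreviousRowCheck {
    // Stands in for the context variable that remembers the last row seen.
    private String previous = null;

    /**
     * Returns true when the current line starts with "05" and the line
     * seen just before it started with "08".
     */
    boolean isFirstDetailAfterHeader(String current) {
        boolean result = previous != null
                && previous.startsWith("08")
                && current.startsWith("05");
        // Save the current value so the next row can be compared against it.
        previous = current;
        return result;
    }

    public static void main(String[] args) {
        // Shortened, made-up lines; only the two-character prefix matters here.
        List<String> lines = Arrays.asList("08 header A", "05 detail 1", "05 detail 2", "08 header B");
        PreviousRowCheck check = new PreviousRowCheck();
        for (String line : lines) {
            System.out.println(line + " : " + check.isFirstDetailAfterHeader(line));
        }
    }
}
```

The comparison happens before the stored value is overwritten, which is the ordering described above: check first, then save.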
Vaibhav


13 Replies
Anonymous
Not applicable
Author

Hi,
So far, tFilterRow doesn't support regex.
Could you please elaborate on your case with an example, including input and expected output values, so that we can see whether there is an alternative solution?
Best regards
Sabrina
Anonymous
Not applicable
Author

Thanks Sabrina,

I note that the documentation suggests that regex is supported; you might want to have that updated.
https://help.talend.com/search/all?query=tFilterRow&content-lang=en
"In the text field, type in the regular expression as required."

Anyway, my use case is essentially an ETL migration from a filesystem and database through to a new target system. The filesystem contains binary files, the database contains metadata. A primary key (id) number links the file to the database record.
I want to iterate through the file-system to identify and join the files to the metadata, then transform them into a new format for import into a new system.
I want to report on the files and folders that were migrated and on those that failed, for example because they were unidentifiable.
I have a file system containing about 30 TB of various image files and folders.
Examples would be:
12345_sometitle.jpg
46602_shot.psd
Latest_3498912.gif
Some_3452_file.bmp
fail_example.jpg
As I iterate through the file system, I want to evaluate both files and folders against a regular expression that is designed to pick up an ID number that may or may not be in the name of the file/folder. There may be multiple regexes to support different criteria.
If one of the criteria matches, I want to extract the ID, then continue processing -- I will do a join to another data source to identify more metadata.
Ultimately I will write an XML file next to the binary file in a new directory, where it will be loaded into the new system.
I have been playing around with Talend for a few days to evaluate whether it will be a useful tool or not... I am new to Talend, but have a Java background.
Thanks,
Rob
Anonymous
Not applicable
Author

Hi Rob,
Based on the description above, I understand that the major challenge is extracting the ID, which may or may not be present in the file or folder name, and that you have firm business rules for identifying it. Once the ID is extracted, the rest is simple for you, i.e. an inner join in tMap to look up the file ID in the database.
To extract the ID from the file system, I would recommend using tJavaRow with multiple if clauses based on your regular expressions, implemented in Java. As you have a Java background, this should not be difficult for you.
Once you have extracted the ID from the file system, you can use tMap to join with the database and continue with your further processing.
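A rough sketch of what such rule-based extraction could look like in plain Java. The class name `FileIdExtractor` and the three patterns are assumptions based only on the sample file names earlier in the thread, not actual business rules. Inside a tJavaRow the same logic would be applied to the incoming file name column.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FileIdExtractor {
    // Hypothetical rules, tried in order; group 1 captures the ID.
    private static final List<Pattern> RULES = Arrays.asList(
        Pattern.compile("^(\\d+)_"),          // e.g. 12345_sometitle.jpg
        Pattern.compile("_(\\d+)\\.[^.]+$"),  // e.g. Latest_3498912.gif
        Pattern.compile("_(\\d+)_")           // e.g. Some_3452_file.bmp
    );

    /** Returns the ID captured by the first rule that matches, if any. */
    static Optional<String> extractId(String name) {
        for (Pattern rule : RULES) {
            Matcher m = rule.matcher(name);
            // find() looks for the pattern anywhere in the name;
            // matches() would require the whole name to match.
            if (m.find()) {
                return Optional.of(m.group(1));
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList(
            "12345_sometitle.jpg", "46602_shot.psd", "Latest_3498912.gif",
            "Some_3452_file.bmp", "fail_example.jpg");
        for (String name : names) {
            System.out.println(name + " -> " + extractId(name).orElse("REJECT"));
        }
    }
}
```

One Java detail worth keeping in mind: `String.matches` only returns true when the entire string matches the pattern, so `"12345_sometitle.jpg".matches("^\\d+")` is false, while `"12345_sometitle.jpg".matches("^\\d+.*")` is true. That may be why the expression in the original question sent every row to the reject flow.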
Please let me know if this helps and whether my understanding matches what you need.
Thanks
Vaibhav
_AnonymousUser
Specialist III

Hello,
Please, I need your help.
I have a text file containing lines of 250 characters, like bank statements. I need to read the file block by block.
For example:

083005600V300026EUR2 0026000722405270614 270614VIREMENT SEPA RECU YCI5 0671 05067120
0530056 00026EUR2 0026000722405270614 NPYXXXXX
0530056 00026EUR2 0026000722405270614 LCC EUR DU 25/06/201
0530056 00026EUR2 0026000722405270614 LC24 ORE309W7001
0530056 00026EUR2 0026000722405270614 RCN550138032301

083005600HY00026EUR2 00260007224B1270614 270614PRLV DDDDDDDD XXXXXXX 06004160
0530056 00026EUR2 00260007224B1270614 NPYDD
0530056 00026EUR2 00260007224B1270614 NBEEDF
0530056 00026EUR2 00260007224B1270614 IJJJJJJJJJJJJJJJJJJJJJJ SESS
0530056 00026EUR2 00260007224B1270614 RUH YYYYYYYYYYYYYY RSSS

I would like to read the file in parts: for example, for each line starting with "08", I take the lines that follow it starting with "05", until I reach the next "08" line, and so on.
Do you have any ideas, please?
Anonymous
Not applicable
Author

Hi Souha,
This is an international forum and English is the language we use. Posting in English will allow you to get more visibility and more help. Thanks for your understanding!
Best regards
Sabrina
_AnonymousUser
Specialist III

Hi everyone,
I need your help, please.
I have a text file (a positional file) where each line has 250 characters, like this:
083005600V300026EUR2 0026000eeeeee270614 270614VIREMENT eeeeeeee YCI5 0671 05067120
0530056 00026EUR2 0026000eeeeee270614 NPYXXXXX
0530056 00026EUR2 0026000eeeeee270614 eeeeeeeeeeeeeeeeeeeeeee
0530056 00026EUR2 0026000eeeeee270614 eeeeeeeeeeeeeeeeeeee
0530056 00026EUR2 0026000eeeeee270614 eeeeeeeeeeeeeeeeeeeeeee
083005600V300026EUR2 0026000eeeeee270614 270614VIREMENT eeeeeeee YCI5 0671 05067120
I would like to read my file like this:
If a line starts with "08", I have to check the next line. If that line starts with "05", a message box containing "NPYXXXXX" should appear;
otherwise, a message box containing "VIREMENT" should appear.

Thanks for your help.
Anonymous
Not applicable
Author

Hi Souha,
Read your input file, then:
- Use tJavaRow
- Use string handling (left) to put the first 2 characters of the first column into a variable
- Compare that variable with 08 and 05 using an if/else clause
- Execute whatever code you want
I would not recommend using a message box; use System.out.println() instead. Otherwise you will get hundreds of message boxes on screen and will not be able to tell what is happening.
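The steps above could be sketched in plain Java roughly as follows. This is only an illustration: the class name `RecordGrouper` and the shortened sample lines are made up, and in a Talend job the per-line logic would sit inside a tJavaRow rather than a loop over a list. It takes the first two characters of each line, compares them with "08" and "05", and collects each "08" line together with the "05" lines that follow it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class RecordGrouper {
    /** Groups each "08" header line with the "05" detail lines that follow it. */
    static List<List<String>> group(List<String> lines) {
        List<List<String>> blocks = new ArrayList<>();
        List<String> current = null;
        for (String line : lines) {
            if (line.length() < 2) {
                continue; // skip blank or too-short lines
            }
            String prefix = line.substring(0, 2); // the "left 2 characters"
            if ("08".equals(prefix)) {
                // A new header starts a new block.
                current = new ArrayList<>();
                current.add(line);
                blocks.add(current);
            } else if ("05".equals(prefix) && current != null) {
                // A detail line belongs to the most recent header.
                current.add(line);
            }
        }
        return blocks;
    }

    public static void main(String[] args) {
        // Shortened, made-up lines; the real file has 250-character lines.
        List<String> lines = Arrays.asList(
            "08 header A", "05 detail A1", "05 detail A2", "",
            "08 header B", "05 detail B1");
        for (List<String> block : group(lines)) {
            System.out.println("Header: " + block.get(0)
                + " (" + (block.size() - 1) + " detail lines)");
        }
    }
}
```

Printing with System.out.println() keeps the output in the console, in line with the advice to avoid message boxes.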
Thanks
Vaibhav
_AnonymousUser
Specialist III

Thanks for your quick reply, sanvaibhav.
How can I compare two lines at the same time? Do I have to write Java code?
Anonymous
Not applicable
Author

Hi Souou,
If you want to compare two lines at a time, you can consider using the tMemorizeRows component, but I don't think you need to do that.
Check the use case in the Talend documentation or refer to some blogs:
https://help.talend.com/search/all?query=tMemorizeRows&content-lang=en
http://www.talendbyexample.com/talend-tmemorizerows-component-reference.html
Thanks
Vaibhav