jerownimow
Contributor III

tFilterRow - unable to catch the whole URL from tExtractXMLField

Hi Folks,

Good day!

 

I'm having a problem with filtering the data from tExtractXMLField using tFilterRow. Basically the flow looks like this:
   (Series of components) -> tExtractXMLField -> tFilterRow


The output records from the Department field of tExtractXMLField look like this:
     <title xmlns="http://www.w3.org/2005/Atom" type="text">Termination Request Form (Responses)</title> 

     <title xmlns="http://www.w3.org/2005/Atom" type="text">New Hire Request Form (Responses)</title>

I want only the records containing *Termination Request Form (Responses)* to be returned.
 
In my tFilterRow, I enabled "Use advanced mode" and tried different statements, as below:

java.util.regex.Pattern.matches(".*Termination Request Form \\(Responses\\).*",row43.Department) 
".*Termination Request Form [(]Responses[)].*".matches(row43.Department)   

Result: Termination Request Form (Responses)
    - With these first two statements I didn't get the expected output. It looks like only the exact words I defined in the condition were returned; the wildcards were not taken into account.

I also tried this:
"\"Termination Request Form (Responses)\"".matches(row43.Department)
java.util.regex.Pattern.compile(row43.Department).matcher(".*Termination Request Form (Responses).*").find()
row43.Department.contains("Termination Request Form (Responses)")
Result: No returned values

I even experimented with the matches and contains functions, but that didn't fix my issue.

Can you help me fix this?
Thank you in advance.



9 Replies
manodwhb
Champion II

@jerome29, in the advanced mode of tFilterRow, write the expression as below. It worked when I tested it.

 

row2.newColumn.contains("Termination Request Form (Responses)")

jerownimow
Contributor III
Author

@manodwhb,
thanks for your reply!
I already tried that as well; unfortunately, it didn't give me the desired output.
manodwhb
Champion II

@jerome29, it's working; I have tested it. Which version of Talend are you using? Also, can you share your tFilterRow settings?

Anonymous
Not applicable

I'm afraid you will need to give us examples of records that did and did not work. Your code (modified to try a few things on my machine)....

java.util.regex.Pattern.matches(".*Termination Request Form \\(Responses\\).*",value)

....will work as long as it is only the wildcards that change. 

 

If you could give some working and non-working example Strings (exactly as they are returned), we might be able to help more.
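For anyone testing this outside Talend, here is a minimal standalone sketch of the conditions discussed above. The class and method names are hypothetical, and `value` stands in for `row43.Department`. It illustrates two points: the parentheses have to be escaped (or quoted) in a regex, and with `String.matches` the pattern is the argument while the input is the receiver, so swapping them silently changes the meaning.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; not part of Talend.
final class FilterConditionDemo {
    // Literal text to look for. The parentheses are regex metacharacters.
    static final String WANTED = "Termination Request Form (Responses)";

    // Regex check: the pattern is the first argument, the input value the second.
    static boolean matchesRegex(String value) {
        return Pattern.matches(".*Termination Request Form \\(Responses\\).*", value);
    }

    // Same check, letting Pattern.quote escape the literal.
    static boolean matchesQuoted(String value) {
        return Pattern.matches(".*" + Pattern.quote(WANTED) + ".*", value);
    }

    // Plain substring check: no regex involved, so nothing to escape.
    static boolean containsLiteral(String value) {
        return value != null && value.contains(WANTED);
    }

    // Pitfall: here the VALUE is treated as the regex and the pattern text as the input.
    static boolean reversed(String value) {
        return ".*Termination Request Form [(]Responses[)].*".matches(value);
    }

    public static void main(String[] args) {
        String bareValue = "Termination Request Form (Responses)";
        String wholeElement = "<title xmlns=\"http://www.w3.org/2005/Atom\" type=\"text\">"
                + "Termination Request Form (Responses)</title>";

        System.out.println(matchesRegex(bareValue));       // true
        System.out.println(matchesRegex(wholeElement));    // true
        System.out.println(matchesQuoted(wholeElement));   // true
        System.out.println(containsLiteral(wholeElement)); // true
        System.out.println(reversed(bareValue));           // false
    }
}
```

Both the regex and the `contains` form accept the bare text as well as the whole element, so neither one alters what the incoming column holds; they only decide whether the row passes.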

jerownimow
Contributor III
Author

Hi @rhall, @manodwhb,

 

Thanks for your reply.

The expected output should be like this:
 <title xmlns="http://www.w3.org/2005/Atom" type="text">Termination Request Form (Responses)</title> 

But that code gives me this:
[Screenshot: 0683p000009Lzrr.png]

The value is being trimmed.

Meanwhile, here's the configuration of my tFilterRow:
[Screenshot: 0683p000009LzVN.png]

I'm using Talend 6.4.1.
Thank you!

Anonymous
Not applicable

Ah I see. The filter component isn't trimming that. Your tExtractXMLField component is only returning the value and not the element name. That is how it is supposed to work.

jerownimow
Contributor III
Author

It seems I missed some of the requirements. Anyway, I now fully understand this; thank you so much for your replies. In the end, I only needed to filter those records, and it's working now.
By the way, one last question: is this the element name you're referring to?

<title xmlns="http://www.w3.org/2005/Atom" type="text">

And does tExtractXMLField return only the value? When I check with tLogRow, it also returns the element plus the value:
[Screenshot: 0683p000009M039.png]

Anonymous
Not applicable

The element name is what you have in your code. Showing the whole element (complete structure) can be done if you tick "Get Node" and your column type is String. Normally "Get Node" would be used to retrieve a sub XML document, and the column type would be Document. If you do not have "Get Node" ticked, you will just get the data back.
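To illustrate the difference between "just the data" and the whole element, here is a rough analogy using the JDK's own XML classes. This is not how tExtractXMLField is implemented; the class and method names are hypothetical, and it only shows the two shapes of output you can get from the same element.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical demo class; an analogy only, not Talend's implementation.
final class NodeVersusValueDemo {
    static final String XML = "<title xmlns=\"http://www.w3.org/2005/Atom\" type=\"text\">"
            + "Termination Request Form (Responses)</title>";

    // Parse the XML string and return its root element.
    static Element parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    // Comparable to leaving "Get Node" unticked: only the text inside the element.
    static String valueOnly(String xml) {
        return parse(xml).getTextContent();
    }

    // Comparable to ticking "Get Node" with a String column: the serialized element.
    static String wholeNode(String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(parse(xml)), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not serialize node", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(valueOnly(XML)); // the text content only
        System.out.println(wholeNode(XML)); // the element with its tags and attributes
    }
}
```

A filter condition such as `contains("Termination Request Form (Responses)")` passes in both cases, which is why the filter itself is not what removes the tags.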

jerownimow
Contributor III
Author

Many thanks @rhall