Hi Folks,
Good day!
I'm having a problem with filtering the data from tExtractXMLField using tFilterRow. Basically the flow looks like this:
(Series of components) -> tExtractXMLField -> tFilterRow
The output records from the Department field of tExtractXMLField look like this:
<title xmlns="http://www.w3.org/2005/Atom" type="text">Termination Request Form (Responses)</title>
<title xmlns="http://www.w3.org/2005/Atom" type="text">New Hire Request Form (Responses)</title>
I only want the records containing this to be returned: *Termination Request Form (Responses)*
On my tFilterRow, I enabled "Use advanced mode" and tried different statements, as below:
java.util.regex.Pattern.matches(".*Termination Request Form \\(Responses\\).*",row43.Department)
".*Termination Request Form [(]Responses[)].*".matches(row43.Department)
Result: Termination Request Form (Responses)
- For the first two lines, I didn't get the expected output. It looks like it only returned the exact words that I defined in the condition; it didn't consider the wildcard.
I also tried this:
"\"Termination Request Form (Responses)\"".matches(row43.Department)
java.util.regex.Pattern.compile(row43.Department).matcher(".*Termination Request Form (Responses).*").find()
row43.Department.contains("Termination Request Form (Responses)")
Result: No returned values
I even played with the functions matches and contains, but that didn't fix my issue.
Can you help me fix this?
Thank you in advance.
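One thing that may be worth checking in the attempts above: in Java, String.matches(regex) treats the string it is called on as the input and its argument as the pattern. In the attempts of the form "...".matches(row43.Department), the roles look reversed, so the column value is being used as the regex. Below is a minimal sketch outside Talend showing the difference. The class and method names are made up for illustration, and the sample value is assumed to be the bare title text; this may not be the whole explanation for the results above.

```java
public class MatchOrderSketch {
    // Regex taken from the attempts above; [(] and [)] match literal parentheses.
    static final String REGEX = ".*Termination Request Form [(]Responses[)].*";

    // Reversed roles: the column value is (mis)used as the regex.
    static boolean reversed(String value) {
        return REGEX.matches(value);
    }

    // Intended roles: the column value is the input, REGEX is the pattern.
    static boolean intended(String value) {
        return value.matches(REGEX);
    }

    public static void main(String[] args) {
        // Assumed sample value (hypothetical stand-in for row43.Department).
        String value = "Termination Request Form (Responses)";
        System.out.println(reversed(value)); // false
        System.out.println(intended(value)); // true
    }
}
```

In tFilterRow advanced mode, the intended form would read row43.Department.matches(".*Termination Request Form [(]Responses[)].*").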
@jerome29, in the advanced mode of tFilterRow, write the expression as below; it worked for me.
row2.newColumn.contains("Termination Request Form (Responses)")
@jerome29, it's working; I have tested it. Which version of Talend are you using? Also, can you share your tFilterRow settings?
I'm afraid you will need to give us examples of records that did and did not work. Your code (modified to try a few things on my machine)....
java.util.regex.Pattern.matches(".*Termination Request Form \\(Responses\\).*",value)
....will work as long as it is only the wildcards that change.
If you could give some working and non-working example Strings (exactly as they are returned), we might be able to help more.
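For anyone who wants to reproduce this kind of check outside Talend, here is a small sketch. It runs the same Pattern.matches expression against the two title strings from the question, plus the bare text without the element. The class name, method name and the sample list are assumptions for illustration only; they are not taken from the job.

```java
import java.util.regex.Pattern;

public class TitleFilterSketch {
    // Same expression as in the thread; "value" stands in for row43.Department.
    static boolean isTermination(String value) {
        return Pattern.matches(".*Termination Request Form \\(Responses\\).*", value);
    }

    public static void main(String[] args) {
        String[] samples = {
            // Whole element, as shown in the question.
            "<title xmlns=\"http://www.w3.org/2005/Atom\" type=\"text\">Termination Request Form (Responses)</title>",
            "<title xmlns=\"http://www.w3.org/2005/Atom\" type=\"text\">New Hire Request Form (Responses)</title>",
            // Bare text only, without the element.
            "Termination Request Form (Responses)"
        };
        for (String sample : samples) {
            System.out.println(isTermination(sample) + " <- " + sample);
        }
    }
}
```

Because of the leading and trailing .* wildcards, the expression accepts both the whole element and the bare text, so the filter keeps the same rows in either case.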
Thanks for your reply.
The expected output should be like this:
<title xmlns="http://www.w3.org/2005/Atom" type="text">Termination Request Form (Responses)</title>
While that code is giving me this:
The value is being trimmed.
Meanwhile, here's the config of my tFilterRow:
and I'm using Talend 6.4.1
Thank you!
Ah I see. The filter component isn't trimming that. Your tExtractXMLField component is only returning the value and not the element name. That is how it is supposed to work.
It seems that I missed some of the requirements. Anyway, I now fully understand this; thank you so much for your replies. At the end of the day, I only needed to filter those records, and it's working now.
Btw, one last question, is this the element name you're referring to?
<title xmlns="http://www.w3.org/2005/Atom" type="text">
And does tExtractXMLField only return the value? As per checking on the tLogRow, it also returns the element + value:
The element name is what you have in your code. Showing the whole element (complete structure) can be done if you tick "Get Node" and your column type is String. Normally "Get Node" would be used to retrieve a sub XML document and the type of column would be Document. If you do not have "Get Node" ticked, you will just be returned the data.
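To make the difference between the element and its data concrete, here is a rough sketch in plain Java, independent of Talend. It parses the whole element string from the question with the standard JAXP parser and prints the element name and the text value separately. The class and method names are hypothetical, and the sketch only illustrates the terminology; it does not show how tExtractXMLField is implemented.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

public class NodeVersusValueSketch {
    // Parses a single XML element given as a String and returns its root element.
    static Element parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // needed so getLocalName() is populated
            return factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Not well-formed XML: " + xml, e);
        }
    }

    // The element name, without any namespace prefix.
    static String elementName(String xml) {
        return parse(xml).getLocalName();
    }

    // The data inside the element.
    static String elementValue(String xml) {
        return parse(xml).getTextContent();
    }

    public static void main(String[] args) {
        String node = "<title xmlns=\"http://www.w3.org/2005/Atom\" type=\"text\">"
                + "Termination Request Form (Responses)</title>";
        System.out.println(elementName(node));  // title
        System.out.println(elementValue(node)); // Termination Request Form (Responses)
    }
}
```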