Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi,
I have a below string message and in the output I want to extract sf:AccountId from message String:
I hope you are looking for below solution 🙂
Below are the component screenshots. You need to read the record tFileInputFullRow so that entire data will be in a single line. The next step is tMap as shown below.
var1 -> row1.line.indexOf("<?xml version=") var2 -> row1.line.indexOf("\",\"type") xml_out -> row1.line.substring( Var.var1 ,Var.var2 )
Please note that var1 and var2 should be in integer format.
Convert the xml_output to Document format.
Connect it to tXMLMap and right click the xml as shown below to load the structure from a file. Please note that you need to copy the XML to a file so that you can select the file to create the structure automatically.
Now you have the entire structure ready in front of you and you just have to drag and drop to output.
And you have the output ready !!! 🙂
Please note that in the first tMap, if the search for XML did not get any result, job may fail. So make sure that all the records have this pattern or add the extra logic there (its a task for you 🙂 )
Hope I have answered your query.
Warm Regards,
Nikhil Thampi
Please appreciate our Talend community members by giving Kudos for sharing their time for your query. If your query is answered, please mark the topic as resolved 🙂
Hi,
You are having an XML message inside a JSON array under an object field.
Problem is that the double quotes without escape is breaking the record while reading the data.
If you are going to follow the same method, you will need custom routines to parse the data from the input message. Could you please check with your source team whether its really required to complicate message like this?
Warm Regards,
Nikhil Thampi
Please appreciate our Talend community members by giving Kudos for sharing their time for your query. If your query is answered, please mark the topic as resolved 🙂
Regular expression could help.Below is example with HTML response .
https://etladvisors.com/2014/01/07/extracting-data-via-http-part-1/
Thank you for your reply Nikhil!
Can we use other components to extract the sf:AccountId?
Thanks Abhishek! but the given link is not working.
Looks like , They took it off because I checked the website
if you will have only one sf:AccountId if your Input ,you could use tExtractRegexField and the string between <sf:AccountId>xxxxxxxxx</sf:AccountId>
try regex "<\\s*sf:AccountId[^>]*>(.*?)<\\s*/\\s*sf:AccountId>.*"
Otherwise, using tJavarow , get with regex Pattern get all <sf:AccountId>xxxxxxxxx</sf:AccountId> in a response
so output of this
<sf:AccountId>xxxxxxxxx</sf:AccountId><sf:AccountId>xxxxxxxxx2</sf:AccountId>
then denormalize into
xxxxxxxxx
xxxxxxxxx2
etc.
Also
Thanks Abhishek!
I am getting an error. Can you give me a tjavarow code pls.
I am executing the below code:
Pattern p = Pattern.compile("0012F00000M5ACgQAN");
Matcher m = p.matcher("0012F00000M5ACgQAN");
while(m.find()) {
System.out.println(m.group());
}
Output is:
[statistics] connecting to socket on port 3495
[statistics] connected
0012F00000M5ACgQAN
.----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------.
| tLogRow_3 |
|=--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------=|
|Column0 |
|=--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------=|
|[{"data":"<?xml version="1.0" encoding="UTF-8"?> <soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"> <soapenv:Body> <notifications xmlns="http://soap.sforce.com/2005/09/outbound"> <OrganizationId>00D2F000000DGMUUA4</OrganizationId> <ActionId>04k2F0000004CGOQA2</ActionId> <SessionId xsi:nil="true"/> <EnterpriseUrl>https://behavecareus--dev.my.salesforce.com/services/Soap/c/44.0/00D2F000000DGMU</EnterpriseUrl> <PartnerUrl>https://behavecareus--dev.my.salesforce.com/services/Soap/u/44.0/00D2F000000DGMU</PartnerUrl> <Notification> <Id>04l2F000007wBMHQA2</Id> <sObject xsi:type="sf:Contact" xmlns:sf="urn:sobject.enterprise.soap.sforce.com"> <sf:Id>0032F00000KAlWDQA1</sf:Id> <sf:AccountId>0012F00000M5ACgQAN</sf:AccountId> </sObject> </Notification> </notifications> </soapenv:Body> </soapenv:Envelope>","type":"text/xml; charset=UTF-8"}]|
'----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------'
.-------.
|tLogRow_4|
|=-----=|
|Column0|
|=-----=|
|null |
'-------'
[statistics] disconnected
I hope you are looking for below solution 🙂
Below are the component screenshots. You need to read the record tFileInputFullRow so that entire data will be in a single line. The next step is tMap as shown below.
var1 -> row1.line.indexOf("<?xml version=") var2 -> row1.line.indexOf("\",\"type") xml_out -> row1.line.substring( Var.var1 ,Var.var2 )
Please note that var1 and var2 should be in integer format.
Convert the xml_output to Document format.
Connect it to tXMLMap and right click the xml as shown below to load the structure from a file. Please note that you need to copy the XML to a file so that you can select the file to create the structure automatically.
Now you have the entire structure ready in front of you and you just have to drag and drop to output.
And you have the output ready !!! 🙂
Please note that in the first tMap, if the search for XML did not get any result, job may fail. So make sure that all the records have this pattern or add the extra logic there (its a task for you 🙂 )
Hope I have answered your query.
Warm Regards,
Nikhil Thampi
Please appreciate our Talend community members by giving Kudos for sharing their time for your query. If your query is answered, please mark the topic as resolved 🙂
Many Thanks Nikhil! Great explanation.
Above approach is better for current Request. Adding Just for Logging purpose.
How to achieve this with Regex Pattern