Do not input private or sensitive data. View Qlik Privacy & Cookie Policy.
Skip to main content

Announcements
Join us in Bucharest on Sept 18th for Qlik's AI Reality Tour! Register Now
cancel
Showing results for 
Search instead for 
Did you mean: 
Anonymous
Not applicable

Extract sf:AccountId from following message String

Hi,

 

I have a below string message and in the output I want to extract sf:AccountId from message String:

 

[{"data":"<?xml version="1.0" encoding="UTF-8"?> <soapenv:Envelope xmlns:soapenv=" http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsd=" http://www.w3.org/2001/XMLSchema" xmlns:xsi=" http://www.w3.org/2001/XMLSchema-instance"> <soapenv:Body> <notifications xmlns=" http://soap.sforce.com/2005/09/outbound"> <OrganizationId>00D2F000000DGMUUA4</OrganizationId> <ActionId>04k2F0000004CGOQA2</ActionId> <SessionId xsi:nil="true"/> <EnterpriseUrl> https://behavecareus--dev.my.salesforce.com/services/Soap/c/44.0/00D2F000000DGMU</EnterpriseUrl> <PartnerUrl> https://behavecareus--dev.my.salesforce.com/services/Soap/u/44.0/00D2F000000DGMU</PartnerUrl> <Notification> <Id>04l2F000007wBMHQA2</Id> <sObject xsi:type="sf:Contact" xmlns:sf="urn: sobject.enterprise.soap.sforce.com"> <sf:Id>0032F00000KAlWDQA1</sf:Id> <sf:AccountId>0012F00000M5ACgQAN</sf:AccountId> </sObject> </Notification> </notifications> </soapenv:Body> </soapenv:Envelope>","type":"text/xml; charset=UTF-8"}]
 
I tried to get output with the following components:
tFileInputDelimited --> tLogRow --> tMap --> tLogRow
 
Can I get the help from anyone to get the proper output. 
Thank you in advance! 
Labels (5)
1 Solution

Accepted Solutions
Anonymous
Not applicable
Author

@pgajghate 

 

I hope you are looking for below solution 🙂

0683p000009M2hK.png

 

Below are the component screenshots. You need to read the record tFileInputFullRow so that entire data will be in a single line. The next step is tMap as shown below.

0683p000009M2hP.png

 

var1   ->         row1.line.indexOf("<?xml version=")
var2   ->         row1.line.indexOf("\",\"type")

xml_out ->     row1.line.substring(  Var.var1 ,Var.var2 )

Please note that var1 and var2 should be in integer format.

 

Convert the xml_output to Document format.

0683p000009M2ff.png

 

Connect it to tXMLMap and right click the xml as shown below to load the structure from a file. Please note that you need to copy the XML to a file so that you can select the file to create the structure automatically.

 

0683p000009M2hU.png

 

Now you have the entire structure ready in front of you and you just have to drag and drop to output.

0683p000009M2hZ.png

 And you have the output ready !!! 🙂

 

Please note that in the first tMap, if the search for XML did not get any result, job may fail. So make sure that all the records have this pattern or add the extra logic there (its a task for you 🙂 )

 

Hope I have answered your query.

 

Warm Regards,
Nikhil Thampi

Please appreciate our Talend community members by giving Kudos for sharing their time for your query. If your query is answered, please mark the topic as resolved 🙂

View solution in original post

10 Replies
Anonymous
Not applicable
Author

Hi,

 

    You are having an XML message inside a JSON array under an object field.

 

    Problem is that the double quotes without escape is breaking the record while reading the data.

 

    If you are going to follow the same method, you will need custom routines to parse the data from the input message. Could you please check with your source team whether its really required to complicate message like this?

 

Warm Regards,
Nikhil Thampi

Please appreciate our Talend community members by giving Kudos for sharing their time for your query. If your query is answered, please mark the topic as resolved 🙂

 

 

akumar2301
Specialist II
Specialist II

 Regular expression could help.Below is example with HTML response .

 

https://etladvisors.com/2014/01/07/extracting-data-via-http-part-1/

Anonymous
Not applicable
Author

Thank you for your reply Nikhil!

Can we use other components to extract the sf:AccountId?

Anonymous
Not applicable
Author

Thanks Abhishek! but the given link is not working. 

akumar2301
Specialist II
Specialist II

Looks like , They took it off because I checked the website 0683p000009MACn.png

 

if you will have only one sf:AccountId if your Input ,you could use tExtractRegexField and the string between <sf:AccountId>xxxxxxxxx</sf:AccountId>

 

try regex "<\\s*sf:AccountId[^>]*>(.*?)<\\s*/\\s*sf:AccountId>.*"

 

Otherwise, using tJavarow , get with regex Pattern get all <sf:AccountId>xxxxxxxxx</sf:AccountId> in a response 

 

so output of this

<sf:AccountId>xxxxxxxxx</sf:AccountId><sf:AccountId>xxxxxxxxx2</sf:AccountId>

 

then denormalize  into 

xxxxxxxxx

xxxxxxxxx2

 

etc.

 

Also 

 

 

Anonymous
Not applicable
Author

Thanks Abhishek!

I am getting an error. Can you give me a tjavarow code pls.

 

I am executing the below code: 

Pattern p = Pattern.compile("0012F00000M5ACgQAN");
Matcher m = p.matcher("0012F00000M5ACgQAN");
while(m.find()) {
System.out.println(m.group());
}

 

Output is:

[statistics] connecting to socket on port 3495
[statistics] connected
0012F00000M5ACgQAN
.----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------.
| tLogRow_3 |
|=--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------=|
|Column0 |
|=--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------=|
|[{"data":"<?xml version="1.0" encoding="UTF-8"?> <soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"> <soapenv:Body> <notifications xmlns="http://soap.sforce.com/2005/09/outbound"> <OrganizationId>00D2F000000DGMUUA4</OrganizationId> <ActionId>04k2F0000004CGOQA2</ActionId> <SessionId xsi:nil="true"/> <EnterpriseUrl>https://behavecareus--dev.my.salesforce.com/services/Soap/c/44.0/00D2F000000DGMU</EnterpriseUrl> <PartnerUrl>https://behavecareus--dev.my.salesforce.com/services/Soap/u/44.0/00D2F000000DGMU</PartnerUrl> <Notification> <Id>04l2F000007wBMHQA2</Id> <sObject xsi:type="sf:Contact" xmlns:sf="urn:sobject.enterprise.soap.sforce.com"> <sf:Id>0032F00000KAlWDQA1</sf:Id> <sf:AccountId>0012F00000M5ACgQAN</sf:AccountId> </sObject> </Notification> </notifications> </soapenv:Body> </soapenv:Envelope>","type":"text/xml; charset=UTF-8"}]|
'----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------'

.-------.
|tLogRow_4|
|=-----=|
|Column0|
|=-----=|
|null |
'-------'

[statistics] disconnected

Anonymous
Not applicable
Author

@pgajghate 

 

I hope you are looking for below solution 🙂

0683p000009M2hK.png

 

Below are the component screenshots. You need to read the record tFileInputFullRow so that entire data will be in a single line. The next step is tMap as shown below.

0683p000009M2hP.png

 

var1   ->         row1.line.indexOf("<?xml version=")
var2   ->         row1.line.indexOf("\",\"type")

xml_out ->     row1.line.substring(  Var.var1 ,Var.var2 )

Please note that var1 and var2 should be in integer format.

 

Convert the xml_output to Document format.

0683p000009M2ff.png

 

Connect it to tXMLMap and right click the xml as shown below to load the structure from a file. Please note that you need to copy the XML to a file so that you can select the file to create the structure automatically.

 

0683p000009M2hU.png

 

Now you have the entire structure ready in front of you and you just have to drag and drop to output.

0683p000009M2hZ.png

 And you have the output ready !!! 🙂

 

Please note that in the first tMap, if the search for XML did not get any result, job may fail. So make sure that all the records have this pattern or add the extra logic there (its a task for you 🙂 )

 

Hope I have answered your query.

 

Warm Regards,
Nikhil Thampi

Please appreciate our Talend community members by giving Kudos for sharing their time for your query. If your query is answered, please mark the topic as resolved 🙂

Anonymous
Not applicable
Author

Many Thanks Nikhil! Great explanation.

akumar2301
Specialist II
Specialist II

Above approach is better for current Request. Adding Just for Logging purpose.

How to achieve this with Regex Pattern

0683p000009M2iw.jpg