Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi All,
I am pulling data from a REST api and uploading the same in DataBase. For this I have used
tRESTClient-->tExtractJSONField-->tOutputOracle
This works fine, until the key part of (key:value) has $
Example JSON is for Chats. The chat might have attachments sometimes and there are extra elements to 'history' and JSON looks like
{
"chats":[
{
"id":"2001"
"comment": null,
"triggered_response": true,
"history": [
{
"name": "Guest1",
"msg_id": "1534494221340",
"msg": "Test message from Guest",
},
{
"name": "Support",
"msg_id": "1534494221340",
"msg": "Test message from support"
}
"attachment": {
"mime_type$string": "image/jpeg",
"name$string": "image.jpg",
"url$string": "image_url"
}
]
}
"count": 179049,
"next_url": "www.nexturl..."
}
This causes the tRestClient to break and gives following error
Exception in component tRESTClient_5 (Rest_API)
javax.ws.rs.client.ResponseProcessingException: Problem with reading the data, class org.dom4j.Document, ContentType: application/json;charset=utf-8.
[ERROR]: org.apache.cxf.jaxrs.utils.JAXRSUtils - Problem with reading the data, class org.dom4j.Document, ContentType: application/json;charset=utf-8.
at org.apache.cxf.jaxrs.impl.ResponseImpl.reportMessageHandlerProblem(ResponseImpl.java:446)
at org.apache.cxf.jaxrs.impl.ResponseImpl.doReadEntity(ResponseImpl.java:386)
at org.apache.cxf.jaxrs.client.AbstractClient.readBody(AbstractClient.java:528)
at org.apache.cxf.jaxrs.client.WebClient.handleResponse(WebClient.java:1126)
at org.apache.cxf.jaxrs.client.WebClient.doResponse(WebClient.java:1114)
at org.apache.cxf.jaxrs.client.WebClient.doChainedInvocation(WebClient.java:1039)
at org.apache.cxf.jaxrs.client.WebClient.doInvoke(WebClient.java:894)
at org.apache.cxf.jaxrs.client.WebClient.doInvoke(WebClient.java:862)
at org.apache.cxf.jaxrs.client.WebClient.invoke(WebClient.java:427)
at org.apache.cxf.jaxrs.client.WebClient.get(WebClient.java:607)
at dataland_data_import.rest_api_0_1.Rest_API.tRESTClient_5Process(Rest_API.java:856)
at dataland_data_import.rest_api_0_1.Rest_API.runJobInTOS(Rest_API.java:2435)
at dataland_data_import.rest_api_0_1.Rest_API.main(Rest_API.java:2284)
Caused by: javax.ws.rs.BadRequestException: HTTP 400 Bad Request
at org.apache.cxf.jaxrs.utils.SpecExceptions.toBadRequestException(SpecExceptions.java:84)
at org.apache.cxf.jaxrs.utils.ExceptionUtils.toBadRequestException(ExceptionUtils.java:121)
at org.apache.cxf.jaxrs.provider.json.JSONProvider.readFrom(JSONProvider.java:268)
at org.apache.cxf.jaxrs.provider.dom4j.DOM4JProvider.readFrom(DOM4JProvider.java:79)
at dataland_data_import.rest_api_0_1.Rest_API$2.readFrom(Rest_API.java:682)
at org.apache.cxf.jaxrs.provider.dom4j.DOM4JProvider.readFrom(DOM4JProvider.java:47)
at org.apache.cxf.jaxrs.utils.JAXRSUtils.readFromMessageBodyReader(JAXRSUtils.java:1379)
at org.apache.cxf.jaxrs.impl.ResponseImpl.doReadEntity(ResponseImpl.java:377)
... 11 more
Caused by: org.w3c.dom.DOMException: INVALID_CHARACTER_ERR: An invalid or illegal XML character is specified.
at com.sun.org.apache.xerces.internal.dom.CoreDocumentImpl.checkQName(CoreDocumentImpl.java:2603)
at com.sun.org.apache.xerces.internal.dom.ElementNSImpl.setName(ElementNSImpl.java:121)
at com.sun.org.apache.xerces.internal.dom.ElementNSImpl.<init>(ElementNSImpl.java:84)
at com.sun.org.apache.xerces.internal.dom.CoreDocumentImpl.createElementNS(CoreDocumentImpl.java:2121)
at org.apache.cxf.staxutils.W3CDOMStreamWriter.createElementNS(W3CDOMStreamWriter.java:155)
at org.apache.cxf.staxutils.W3CDOMStreamWriter.createAndAddElement(W3CDOMStreamWriter.java:161)
at org.apache.cxf.staxutils.W3CDOMStreamWriter.writeStartElement(W3CDOMStreamWriter.java:108)
at org.apache.cxf.staxutils.StaxUtils.writeStartElement(StaxUtils.java:832)
at org.apache.cxf.staxutils.StaxUtils.copy(StaxUtils.java:760)
at org.apache.cxf.staxutils.StaxUtils.copy(StaxUtils.java:724)
at org.apache.cxf.jaxrs.provider.json.JSONProvider.copyReaderToWriter(JSONProvider.java:411)
at org.apache.cxf.jaxrs.provider.json.JSONProvider.readFrom(JSONProvider.java:227)
... 16 more
any suggestions how to handle this 😞
OK, I have figured out what is going wrong and unfortunately there isn't really a *nice* fix to it. You need to parse this like XML (using XPath) because JsonPath irritatingly does not support the parent operator (which I agree is annoying). The problem you have is that XML does not allow element names with $ characters in them. Which kind of leaves you in limbo. What you can do is put a tJavaFlex or tJavaRow between your tRestClient and your tExtractJson component and replace any $ characters with another character (maybe a "_"). I did this to try it out and it worked perfectly.
I agree it is irritating, but this isn't really a Talend bug. It is more of a flaw of the flexibility (ie allowing your to parse Json as XML).
The code I added can be seen below....
row1.string = string.replace('$', '_');
I hope this helps.
Do you have the "Convert Response to DOM Document" option ticked in the Advanced Settings tab of the tRestClient? If so, try unticking it and running.
I don't think /chats/history[*] is correct. For a start, chats is an array. You probably need to expand on the xpath "/chats" and assign the "history" tag to a column, then expand that on "/" and pull out the "name", "msg_id", and "msg" elements.
You might be able to iterate directly using something like "/chats[*]/history", but I think I might be mixing XPath and JsonPath terminology there!
Read the file using JsonPath. If you're working with JSON it makes sense to use JSON tools. Your loop should be something like ....
"$.chats.history[*]"
HiPhilHibbs,
looping on element is not a big issues here for now, the main issue that I am facing is tExtractJSONElements not being able to parse/extract information from tRestClient because there is $ in the one of the key of key:value pair.
"attachment": {
"mime_type$string": "image/jpeg",
"name$string": "image.jpg",
"url$string": "image_url"
}
A side note Loop XPath Query : "/chats/history[*]" did work as expected and I got expected result
ChatID History_Name History_Msg
2001 Guest1 Test message from Guest
2001 Support Test message from support
and so on..
Regarding the reply from rhall_2_0
I did try with reading the file with JSON Path and used the Loop JSON Query, but the issue here is I get the values of looping element but not the parent element
Something like this
ChatID History_Name History_Msg
NULL Guest1 Test message from Guest
NULL Support Test message from support
I just tried to quickly try something out and it showed that the JSON you've given us is poorly formatted. Should it be more like this?
{ "chats":[ { "id":"2001", "comment":null, "triggered_response":true, "history":[ { "name":"Guest1", "msg_id":"1534494221340", "msg":"Test message from Guest" }, { "name":"Support", "msg_id":"1534494221340", "msg":"Test message from support" } ], "attachment":[ { "mime_type$string":"image/jpeg", "name$string":"image.jpg", "url$string":"image_url" } ] } ], "count":179049, "next_url":"www.nexturl..." }
Ah. I had that exact problem with a key that had a "." in it, I couldn't parse it using JsonPath so I had to switch to XPath. Maybe you are having the opposite problem, that XPath can't parse it because of the "$". Try switching to JsonPath!
And hope that you don't get a tag that has both a "$" and a "."...
@rhall -Yes the format you mentioned is correct.
@PhilHibbs Yes I tried with JSON Path rather than XPath and used the Loop JSON Query, but the issue here is I get the values of looping element but not the parent element
Something like this
ChatID History_Name History_Msg
NULL Guest1 Test message from Guest
NULL Support Test message from support
How about extracting the JSON in two stages - one to expand the "/chats[*]" and pull out the "id", "attachment", and the "history", then a second stage that expands the "history" with a Loop query of "$" (or "/" for Jsonpath, as there will be no troublesome $s here ) and pulls out the history elements.