Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi All,
I'm trying to retrieve below mongo document
{
"id": "5b69b66d3dae73000fa39440",
"data":"Testing"
"products": [{ "orderid":"1234"
}],
"createDate": "2018-08-07T15:10:37.570Z",
"updateDate": "2018-08-09T16:09:46.621Z"
}
I'm able to get Id,data but when i try to get products im unable to get the data. On top of it i'm getting products as below
[Document{{"orderid"="1234"}}] this is blocking me for parsing it as json. Can someone help. I think its a basic mistake but as i said im new to Talend OS for Big Data
I can't really say unless I have access to the MongoDB database you are using. Also, try using the 3.2 driver with your tMongoDBInput component. I have heard that is a good way of getting around some issues
Any update will be highly appriciated.
@Sujay wrote:
Hi All,
I'm trying to retrieve below mongo document
{
"id": "5b69b66d3dae73000fa39440","data":"Testing"
"products": [{ "orderid":"1234"
}],
"createDate": "2018-08-07T15:10:37.570Z",
"updateDate": "2018-08-09T16:09:46.621Z"
}
I'm able to get Id,data but when i try to get products im unable to get the data. On top of it i'm getting products as below
[Document{{"orderid"="1234"}}] this is blocking me for parsing it as json. Can someone help. I think its a basic mistake but as i said im new to Talend OS
@xdshi any help would be appreciated
You can do this using two tExtractJSONFields components. I have built an example job as shown below...
First of all, I should point out that your JSON was missing a comma before "products". I assumed that this was a mistake and I added it. I also added another record for products to test this solution. So the JSON I used for this is below...
{ "id": "5b69b66d3dae73000fa39440", "data":"Testing", "products": [{ "orderid":"1234"}, { "orderid":"1236"}], "createDate": "2018-08-07T15:10:37.570Z", "updateDate": "2018-08-09T16:09:46.621Z" }
The first tExtractJSONFields component extracts "id", "data", "createDate" and "updateDate" fields while passing through the JSON in its original state using a column called "newColumn". The JSON was originally supplied to the "newColumn" column and this is being used to pass it through the tExtractJSONFields_2 component....
The next component is the tExtractJSONFields_1 component. This will pass through all of the values retrieved by the tExtractJSONFields_2 component, then it extracts the products array. This can be seen below....
Given the example JSON I showed above, it outputs the following....
[statistics] connecting to socket on port 3491 [statistics] connected 5b69b66d3dae73000fa39440|Testing|2018-08-07T15:10:37.570Z|2018-08-09T16:09:46.621Z|1234 5b69b66d3dae73000fa39440|Testing|2018-08-07T15:10:37.570Z|2018-08-09T16:09:46.621Z|1236 [statistics] disconnected
Hi @rhall firstly i'm thrilled to have a response . Secondly as i stated above it's problem only when i try from TmongoIput. I was able to parse all the fileds using single jsonextract from json file. Below is the screenshot of the issue
If look carefully the products comes in such a way it wont parse using Jsonextract as it has attributes in key,value pair "orderActionID=abcd" but if you see actual collection it is just an array
{ "id": "5b69b66d3dae73000fa39440", "data":"Testing", "products": [
{ "orderactionId":"1234"}, { "productCode":"1236"}
], "createDate": "2018-08-07T15:10:37.570Z", "updateDate": "2018-08-09T16:09:46.621Z" }
Hope i'm making sense
Note: in Schema i tried string/object for products all gives me the same. In tmap im not doing anything, since there are lot of fields for showing actual problem in screenshot only few fields i used tmap.
I think (I don't use the tMongoInput very much and do not have mongo db on my machine) that you should be able to simply use the same technique as I used, but use the second tExtractJSONFields after the tMongoInput. You might need to send the "Products" field as a JSON document rather than the original document (which is what I did in my example), but you should still be able to pass the other fields through by naming them the same (in the same order) and leaving the JSONPaths blank in the tExtractJSONFields component.
Thanks @rhall maybe can you check within your internal teams who use Mongo or setup a local mongo for test purposes ?? I would like to fix this only from MongoInput.
As i was able to parse from file. But i would like to avoid the hops of bringing data from file and then parse and do transformations. Once again appreciate your time and efforts.
Have you tried the solution I gave using tMongoInput and a tExtractJsonFields component?
@rhall it will not work as i stated before as its not a json format (due to the document ). To justify it i'm pasting the error logs and screenshots. If you can help me fix why the retrieval of array from Mongo is showing in key,value pair and to avoid it i will be able to fix this problem. Parsing json is easy if i'm able to retrieve it properly from MongoDB.
logs:
Starting job mongo_document at 16:55 16/09/2019. [statistics] connecting to socket on port 3384 [statistics] connected tMongoDBInput_2 - The query does not contain any reference an index. [ _id, accountNumber, orderDate, market, updateDate, context, orderStatus, trackingStatus, createDate, orderType, products.productCode, orderId, writeBackAttemptDate, cancelReasonCode ] Expected to find an object with property ['products'] in path $ but found 'net.minidev.json.JSONArray'. This is not a json object according to the JsonProvider: 'com.jayway.jsonpath.spi.json.JsonSmartJsonProvider'. Expected to find an object with property ['products'] in path $ but found 'net.minidev.json.JSONArray'. This is not a json object according to the JsonProvider: 'com.jayway.jsonpath.spi.json.JsonSmartJsonProvider'. Expected to find an object with property ['products'] in path $ but found 'net.minidev.json.JSONArray'. This is not a json object according to the JsonProvider: 'com.jayway.jsonpath.spi.json.JsonSmartJsonProvider'. Expected to find an object with property ['products'] in path $ but found 'net.minidev.json.JSONArray'. This is not a json object according to the JsonProvider: 'com.jayway.jsonpath.spi.json.JsonSmartJsonProvider'. Expected to find an object with property ['products'] in path $ but found 'net.minidev.json.JSONArray'. This is not a json object according to the JsonProvider: 'com.jayway.jsonpath.spi.json.JsonSmartJsonProvider'. .------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------. | tLogRow_5 | |=-----------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------=| |_id |products | |=-----------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------=| |5b7afc6583131c000ecd3be1|[Document{{orderID=1111111,Code=1234455}}] | | '------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------' .---+--------. | tLogRow_2 | |=--+-------=| |_id|products| |=--+-------=| '---+--------' [statistics] disconnected Job mongo_document ended at 16:55 16/09/2019. [exit code=0]
Summarize: The problem is while retrieving data from Mongo collection data using tmongdbinput i get key value pairs from actual array.
In Mongo data:{"id": "5b69b66d3dae73000fa39440","products": [{"orderID": "118925871" }], "createDate": "2018-08-07T15:10:37.570Z", "updateDate": "2018-08-09T16:09:46.621Z"}
In talend:
_id|products
5b7afc6583131c000ecd3be1|[Document{{orderD=1234}}]
Hope this is clear. Problem is format.
You can trick your job into dealing with this. Notice that the products column from your tMongoDBInput returns this....
[Document{{orderID=1111111,Code=1234455}}]
The problem you have here is the "[Document{" and the "}]" at the end. If you put this through a tMap and replace the "[Document{" with "{"products":[" and replace the "}]" at the end with "]}", that will be a legitimate JSON document in its own right. When you pass that to the tExtractJSONFields component I showed you, that should work. You will need to pass through the other columns as I demonstrated.