I am currently trying to create an ingestion job workflow using Kafka in Talend Studio. The job reads the JSON data from the topic "work" and stores it in a Hive table. My idea is to use the following flow in Talend:
tKafkaInput > tLogRow > tJava > tMap
tKafkaInput and tLogRow : consume the JSON data from the topic "work"
tJava : fetch the JSON data and pass it to tMap
tMap : structure the data and save it into the Hive table
Note : a snippet of the JSON data in the Kafka topic, as output by tLogRow_1, is in the attachment (data).
The tJava code that fetches the JSON data is essentially the following; it tries to read the "Vers" value from the JSON:
// Read the tLogRow output string from the global map
String output = (String) globalMap.get("tLogRow_1_OUTPUT");
// Parse it as a JSON object
JSONObject jsonObject = new JSONObject(output);
System.out.println(jsonObject);
// Extract the "Vers" value
String sourceDBName = jsonObject.getString("Vers");
However, I received the error shown in the attachment (Error).
My questions are:
Any help is much appreciated, thanks.
Hi
tExtractJSONFields is the best component for extracting data from a JSON string. Please try it and let me know if it does not fit your needs or if you have any questions.
tKafkaInput > tExtractJSONFields > tMap
Regards
Shong
Hi Shong, I have tried using tExtractJSONFields with the configuration below:
Here, I loop over the JSON path RequestHeader to fetch the data inside it. However, when I run the job, there is no output from Talend.
Hi
Set the Loop Jsonpath query to "$.RequestHeader" and try again.
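To illustrate what the loop query does: tExtractJSONFields iterates over the node(s) matched by the Loop Jsonpath query, and each mapped column is then read relative to that node. A sketch of the configuration, assuming a message shape like the one in the attachment (the field names "Vers" and "ReqID" are illustrative):

```
Assumed message:
  { "RequestHeader": { "Vers": "1.0", "ReqID": "123" }, ... }

tExtractJSONFields settings (sketch):
  Read By             : JsonPath
  Loop Jsonpath query : "$.RequestHeader"
  Mapping             : column Vers   ->  Json query "Vers"
                        column ReqID  ->  Json query "ReqID"
```

The Json query for each column is relative to the looped node, so "Vers" here means $.RequestHeader.Vers.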
Regards
Shong
Hi Shong,
Currently I am trying to read nested JSON under two parent JSON objects, as you can see in the image:
Currently, I am able to read the JSON data from RequestHeader using the tExtractJSONFields component (see the Main connection between the two tExtractJSONFields components).
However, for the second component I can only use a "Reject" connection, instead of Main/OnComponentOk. Is it possible to read the nested data this way?
Or do you have any other idea? Thanks.
Reference I used : https://help.talend.com/r/Eizi~hPs0B4M_mO2ot6_1g/Ao7wb2mUfg1hug8GfwRXLw
Use a tReplicate after tKafkaInput to replicate the data flow, so that you can read the JSON string several times. E.g.:
tKafkaInput--tReplicate--tExtractJSONFields_1
                      \--tExtractJSONFields_2
Hi Shong,
Is there any way to merge those two components in TDF? I tried tUnite and tMap, but that didn't work well.
Store the results in tHashOutput, read the data back from memory with tHashInput, and merge the data in the next subjob.
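Putting the two suggestions together, the overall job design would look roughly like this (a sketch; component names and the join key are assumptions, and the tMap join key would be whatever field identifies each Kafka message in your data):

```
Subjob 1 (extract both branches, cache in memory):
  tKafkaInput--tReplicate--tExtractJSONFields_1--tHashOutput_1
                        \--tExtractJSONFields_2--tHashOutput_2

Subjob 2 (triggered OnSubjobOk: merge and write to Hive):
  tHashInput_1--tMap--(Hive output, e.g. tHiveOutput)
  tHashInput_2--/  (lookup flow, joined in tMap on a shared key)
```

In tMap, connect tHashInput_1 as the main flow and tHashInput_2 as a lookup, join them on a common column, and map the combined row to the Hive output schema.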