Skip to main content
Announcements
See what Drew Clarke has to say about the Qlik Talend Cloud launch! READ THE BLOG
cancel
Showing results for 
Search instead for 
Did you mean: 
stucas
Contributor
Contributor

How to process anonymous JSON for Key Value pairs

Taking a simple JSON structure

{

  "593": "Guernsey",

  "594": "Jersey",

  "595": "Isle of Man",

  "599": "Unknown",

  "000": "United Kingdom",

  "800": "Virgin Islands (US)",

  "801": "British Antartic Territory",

  "802": "Guinea-Bissau"

}

I wish to convert this into a table structure of "Id", "caption"

Is there any way to achieve this simply using Talend components?

I have tried tExtractJSONFields but this requires that the key is known - and as the above example illustrates, my data source could have any key.

Other components exist in Exchange (but these seem to parse the above json as a single string)

Labels (3)
8 Replies
Anonymous
Not applicable

I've knocked up a very quick example of how you can do this. It uses a tJavaFlex and a tLogRow. The tJavaFlex to get the data and the tLogRow to simply print out the returned results. To use this in your job you will need to modify it, but everything you need is there.

 

First, in the tJavaFlex I use the following in the Star Code section.....

 

*******************************************************************

// start part of your Java code

String json = "{\r\n\r\n \"593\": \"Guernsey\",\r\n\r\n \"594\": \"Jersey\",\r\n\r\n \"595\": \"Isle of Man\",\r\n\r\n \"599\": \"Unknown\",\r\n\r\n \"000\": \"United Kingdom\",\r\n\r\n \"800\": \"Virgin Islands (US)\",\r\n\r\n \"801\": \"British Antartic Territory\",\r\n\r\n \"802\": \"Guinea-Bissau\"\r\n\r\n}";

 

JSONObject obj = new JSONObject(json);

 

java.util.Iterator<String> it = obj.keys();

 

while(it.hasNext()) {

*********************************************************************

 

That is your JSON example as a String and 3 lines of code to read it into a JSONObject, an Iterator to iterate over the JSONObject keys and a while loop to allow the processing of each item in the JSON.

 

The Main Code can be seen below. I have set up 2 columns (key and value) to output the records as they are found.

 

*********************************************************************

String key = it.next();

  String val = (String)obj.get(key);

   

  row1.key = key;

  row1.value = val;

*********************************************************************

 

This code reads the key and the value from the JSONObject and sets the columns for this row.

 

The End Code section just has a close to while loop.

 

*********************************************************************

}

*********************************************************************

 

When I run this, I get this response.....

 

Starting job TestJob at 17:18 02/12/2022.

[statistics] connecting to socket on port 3626

[statistics] connected

594|Jersey

000|United Kingdom

595|Isle of Man

599|Unknown

800|Virgin Islands (US)

801|British Antartic Territory

802|Guinea-Bissau

593|Guernsey

[statistics] disconnected

 

Job TestJob ended at 17:18 02/12/2022. [Exit code = 0]

stucas
Contributor
Contributor
Author

Really appreciate this - many thanks!

 

Obviously of course my issue is that simplifying a problem leads to more questions.

The source feed into the tJavaFlex is a data flow; eachr ronsisting of two columns, the table "name" and the json structure itself.

 

As this is on a per row basis - the tJavaFlex would need to have the JSONObject intantiated in the "Main Code" section - which means that somehow I would need to produce a row for every entry in the JSON.

 

The idea being to feed transformed json (id, caption) into a specified tHashMap that can use used in a lookup later in in the process.

 

I'm looking into how to do this now, but any ideas would be gratefully received.

Anonymous
Not applicable

I'm not sure how you are getting this data from your source, but I am assuming you have this sorted already. If that is the case, the only thing you need to do is use a tFlowToIterate component between your source and your tJavaFlex. The tFlowToIterate will allow you to save your two columns into the globalMap. The tFlowToIterate will link to your tJavaFlex using an Iterate link. This will essentially start the following components from the very beginning of the process for every iteration. You could use the globalMap value holding your JSON in the Start section of your tJavaFlex.

 

Now the tFlowToIterate will use this naming convention for the name of your globalMap key....

 

{row name}.{column name}

 

....so if your row is "row1" and your column name is "JSON", your globalMap value will be retrieved from the globalMap with code like this....

 

((String)globalMap.get("row1.JSON"))

 

This should get you started. You may discover a few more problems when you build through this. But give this a go and see how you get on.

jlolling
Creator III
Creator III

With Talend build-in components you cannot do that.

BUT with the component suite tJSONDoc* from Talend Exchange this is possible because the tJSONDocOutput component allows to set the attribute name freely. It is actually surprisingly easy to do.

jlolling
Creator III
Creator III

Check them out here. Take a look at the documentation

https://exchange.talend.com/#marketplaceproductoverview:marketplace=marketplace%252F1&p=marketplace%252F1%252Fproducts%252F1438&pi=marketplace%252F1%252Fproducts%252F1438%252Fitems%252F2999

jlolling
Creator III
Creator III

You could also do a lot more like merging documents based on attribute values (like foreign key relations).

stucas
Contributor
Contributor
Author

Ah - Interesting.

I might look at doing that - the imperative programmer in me wanted to keep it in a pipeline; so currently I've taken a string manipulation approach; not what I'd call efficient - but for the (many) small volumes of data I have it works well enough:

 

0695b00000aEjc7AAC.png 

In essence - I've called a rest api that returns a structure of many "tables". The tExtractJSONFields then converts the tables into columns that I then then "unpivot" into the structures that I named above (anonymously defined)

 

I then use a variation of the code you gave in a tJavaRow to "tokenise" the now columnised data:

----------------------------------------------------------------

String parsed = "";

JSONObject obj = new JSONObject(input_row.pivot_value);

java.util.Iterator<String> it = obj.keys();

while(it.hasNext()) {

String key = it.next();

String value = obj.get(key).toString();

if ((key.trim().length() > 0) && (value.trim().length() > 0)) {

String token = String.format("%s~%s@@", key, value);

parsed+=token;

}

}

output_row.name = input_row.pivot_key;

output_row.tokenised = parsed;

----------------------------------------------------------------

 

This is then parsed by a normalize function (which normalizes on the reformatted column) and is then passed to a tMap to write records to the appropriate target table using filters.

stucas
Contributor
Contributor
Author

Many thanks @Jan Lolling​ 

I will check these out - as the above solution I use is acceptable only for small simple "tables/documents" not some of the more complex ones which I was potentially dreading to workout.

 

When I come to these I'll investigate more.

 

Thanks for the heads up.