changpinghsiao
Contributor

How to use Talend Studio to split a JSON file containing an array into multiple files, each containing one element of the array?

Hello:

We need to use Talend to parse a JSON file like the one shown below (and uploaded), separate its three elements (objects), and save each one into its own file for further processing.

[
    {
        "Employer": {
            "PEXBatchID": "1"
        },
        "Employee": {
            "appAccountID": "11"
        }
    },
    {
        "Employer": {
            "PEXBatchID": "2"
        },
        "Employee": {
            "appAccountID": "22"
        }
    },
    {
        "Employer": {
            "PEXBatchID": "3"
        },
        "Employee": {
            "appAccountID": "33"
        }
    }
]

Using the tFileInputJSON component, we can see there are three records, but we don't know how to save them into separate files.

0695b00000OBjbKAAT.jpg

[statistics] connecting to socket on port 4054

[statistics] connected

{"Employee":{"appAccountID":"11"},"Employer":{"PEXBatchID":"1"}}

{"Employee":{"appAccountID":"22"},"Employer":{"PEXBatchID":"2"}}

{"Employee":{"appAccountID":"33"},"Employer":{"PEXBatchID":"3"}}

[statistics] disconnected

Do any of you have any suggestions on how to achieve this? Your help is highly appreciated.

Yours,

Chang-Ping Hsiao

1 Solution

Accepted Solutions
Anonymous
Not applicable

On tFixedFlowInput_2, get the current record, set the value of Data column as: (String)globalMap.get("row11.Data")

If the file name isn't important, you can generate a dynamic file name like this:

"d:/file/out"+((Integer)globalMap.get("tFlowToIterate_1_CURRENT_ITERATION"))+".json"

 


7 Replies
Anonymous
Not applicable

Hi

You need to iterate each record using tFlowToIterate, the job looks like:

tFileInputJSON--main(row1)--tFlowToIterate--iterate--tFixedFlowInput--main--tFileOutputJSON

tFixedFlowInput: get the current record

 

What should the output file name be? Should some data from the record, such as appAccountID (11), be part of the file name?
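To make the per-record idea concrete outside of Talend Studio, here is a minimal plain-Java sketch of what the iterate link achieves: each record is handled on its own and written to its own file. This is only an analogue, not Talend-generated code. The class name SplitRecords, the method writeEachRecord, and the "out-" prefix are hypothetical, and the counter starting at 1 is an assumption based on the out-1.json name seen later in this thread.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SplitRecords {

    // Writes each record to its own file: <prefix>1.json, <prefix>2.json, ...
    // Returns the paths that were written, in the order of the input records.
    public static List<Path> writeEachRecord(List<String> records, Path dir, String prefix)
            throws IOException {
        List<Path> written = new ArrayList<>();
        int iteration = 0; // hypothetical counter, incremented once per record
        for (String record : records) {
            iteration++;
            Path target = dir.resolve(prefix + iteration + ".json");
            Files.write(target, record.getBytes(StandardCharsets.UTF_8));
            written.add(target);
        }
        return written;
    }

    public static void main(String[] args) throws IOException {
        // The three records exactly as tFileInputJSON printed them in the original post.
        List<String> records = Arrays.asList(
            "{\"Employee\":{\"appAccountID\":\"11\"},\"Employer\":{\"PEXBatchID\":\"1\"}}",
            "{\"Employee\":{\"appAccountID\":\"22\"},\"Employer\":{\"PEXBatchID\":\"2\"}}",
            "{\"Employee\":{\"appAccountID\":\"33\"},\"Employer\":{\"PEXBatchID\":\"3\"}}");
        Path dir = Files.createTempDirectory("split-json");
        for (Path p : writeEachRecord(records, dir, "out-")) {
            System.out.println(p);
        }
    }
}
```

If a value from the record should be part of the file name, the same loop could build the name from that value instead of the counter.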

 

Regards

Shong

changpinghsiao
Contributor
Author

@Shicong Hong​ 

 

Thank you for your response and help. I will try your solution.

 

The output file name is not important here, as the files will all be processed and removed from the location when the job finishes. I was thinking of some auto-generated number, but using specific data from each record in the file name would be nice.

 

Yours,

 

Chang-Ping Hsiao

changpinghsiao
Contributor
Author

@Shicong Hong​ 

 

I tried your solution, but it doesn't work.

 

There is only 1 row to tFileOutputJson. See screenshot below.

 

0695b00000OBs2dAAD.jpg 

Configurations for all four components are also captured in the following screenshots.

0695b00000OBs37AAD.jpg0695b00000OBs3gAAD.jpg0695b00000OBs3qAAD.jpg0695b00000OBs4UAAT.jpg 

The two output files I get (out-null.json and out-1.json, archived in out.zip) contain only the following content.

 

 

{"data":[{"Data":null}]}

 

I don't know how to use the Mode setting in tFixedFlowInput, even after checking the online documentation (at https://help.talend.com/r/en-US/7.3/tfixedflowinput/tfixedflowinput?tocId=f4ZiQ8GJByf7hZnDZVS8LA).

 

BTW, if you know it and you don't mind, could you please direct me to the information about the differences between the Main and Iterate rows? I found out some components don't work with some row types.

 

Thank you for your help.

 

Yours,

 

Chang-Ping Hsiao

Anonymous
Not applicable

On tFixedFlowInput_2, get the current record, set the value of Data column as: (String)globalMap.get("row11.Data")

If the file name isn't important, you can generate a dynamic file name like this:

"d:/file/out"+((Integer)globalMap.get("tFlowToIterate_1_CURRENT_ITERATION"))+".json"

 

changpinghsiao
Contributor
Author

@Shicong Hong​ 

Thank you for showing me what to use to get data in tFixedFlowInput. I can see the data now, but only the last record.

 

As you can see below, there is only 1 row for tFileOutputJSON, and it's the last record. Also, ((Integer)globalMap.get("tFlowToIterate_1_CURRENT_ITERATION")) doesn't actually return the iteration number: the file name produced by out-"+((Integer)globalMap.get("tFlowToIterate_1_CURRENT_ITERATION"))+".json" is out-null.json.

0695b00000OC6KCAA1.png

changpinghsiao
Contributor
Author

@Shicong Hong​ 

I am not sure why, but using the following flow, I am able to generate 3 files now.

0695b00000OC7JgAAL.jpg

The data now looks like this.

0695b00000OC7U3AAL.jpg

For out-1.json:

{

    "data": [

        {

            "Data": "{\"Employee\":{\"appAccountID\":\"11\"},\"Employer\":{\"PEXBatchID\":\"1\"}}"

        }

    ]

}
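The escaped quotes in the Data value above are what you would expect when a record that is already JSON text gets stored as a string value inside another JSON document. The sketch below reproduces that shape in plain Java. It is a simplified stand-in, not the implementation of tFileOutputJSON: the class JsonStringWrapDemo and the methods quote and wrap are hypothetical, and quote handles only backslashes and double quotes.

```java
public class JsonStringWrapDemo {

    // Minimal JSON string quoting for illustration: escapes backslashes first,
    // then double quotes. Control characters are deliberately not handled.
    public static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // Puts the record into a {"data":[{"Data": ...}]} envelope as a string value,
    // mirroring the structure of the out-1.json content shown in this thread.
    public static String wrap(String record) {
        return "{\"data\":[{\"Data\":" + quote(record) + "}]}";
    }

    public static void main(String[] args) {
        String record = "{\"Employee\":{\"appAccountID\":\"11\"},\"Employer\":{\"PEXBatchID\":\"1\"}}";
        // The record appears as an escaped string, not as a nested object.
        System.out.println(wrap(record));
    }
}
```

If the downstream process needs the bare object instead, the record string would have to be written to the file as-is rather than as a column value of a JSON output.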

 

Thank you for your help.

 

Yours,

 

Chang-Ping Hsiao

Anonymous
Not applicable

In your job the component is actually labeled tFlowToIterate_3, so the globalMap key must use that name instead of tFlowToIterate_1. Always adjust the key to match the real component label. Anyway, you have it working now.
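For anyone wondering where out-null.json came from, the following plain-Java sketch shows the mechanism. It assumes globalMap behaves like an ordinary java.util.Map, uses a HashMap as a stand-in, and the class IterationKeyDemo and method fileName are hypothetical names for illustration. A lookup with a key that was never set returns null, and Java string concatenation renders a null reference as the text "null".

```java
import java.util.HashMap;
import java.util.Map;

public class IterationKeyDemo {

    // Builds the file name the same way as the expression in this thread,
    // with the component name passed in so both cases can be compared.
    public static String fileName(Map<String, Object> globalMap, String componentName) {
        return "out-" + ((Integer) globalMap.get(componentName + "_CURRENT_ITERATION")) + ".json";
    }

    public static void main(String[] args) {
        // Stand-in for the job's globalMap: only tFlowToIterate_3 has set its counter.
        Map<String, Object> globalMap = new HashMap<>();
        globalMap.put("tFlowToIterate_3_CURRENT_ITERATION", 1);

        // Wrong component name: get() returns null, which concatenates as "null".
        System.out.println(fileName(globalMap, "tFlowToIterate_1")); // prints out-null.json

        // Matching component name: the stored value is used.
        System.out.println(fileName(globalMap, "tFlowToIterate_3")); // prints out-1.json
    }
}
```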