BooWseR
Contributor

Reading a 200 MB JSON file takes forever

Hello everyone, I have a little problem:

I'm currently rebuilding an existing Talend job. The problem is that the job retrieves data from a REST API over a runtime of about two hours, and if the API stops responding even briefly, the job aborts.

Since the job is already working and in use, I would rather not change its structure much, so I wrote a Python script that downloads the data beforehand and combines it into a single JSON file. The JSON has the same structure as the REST API's response and can (in theory) be used in the job without problems. I tried the setup with a small file (3 MB) and it works. But when I try to load the 200 MB file, it runs forever; I aborted the last attempt after 12 hours (see the result in the attachment).
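For context, the pre-download step can be sketched roughly like this. The retry logic and the page-merging are the essential parts; the endpoint URL, pagination scheme, and response shape are assumptions, since the actual API is not shown:

```python
import json
import time
import urllib.error
import urllib.request


def fetch_with_retry(url, retries=5, backoff=2.0):
    """GET a URL, retrying with exponential backoff so a brief
    API outage does not abort the whole download (hypothetical endpoint)."""
    for attempt in range(retries):
        try:
            with urllib.request.urlopen(url, timeout=30) as resp:
                return json.loads(resp.read().decode("utf-8"))
        except (urllib.error.URLError, TimeoutError):
            if attempt == retries - 1:
                raise
            time.sleep(backoff * 2 ** attempt)


def merge_pages(pages):
    """Flatten per-page record lists into one array, so the output
    file has the same shape as a single API response."""
    merged = []
    for page in pages:
        merged.extend(page)
    return merged
```

With a loop over `fetch_with_retry` per page, `json.dump(merge_pages(pages), fh)` then produces the combined 200 MB file described above.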

I also have to use tFileInputRaw, because the fields to extract are already configured in tExtractJSONFields, and tFileInputJSON does not offer the "Is Array" option.

I just can't imagine that Talend cannot read a 200 MB file. For comparison, I wrote a Python script that extracts multiple values from the same JSON, and it took less than 30 seconds.

 

I hope you can tell me how to solve this problem.

Best regards

 

Edit: I tried different memory allocations, between 5 and 12 GB of the 16 GB of available memory.

1 Reply
Anonymous
Not applicable

If possible, split the file into multiple small files, iterate over them, and pass each file path as a parameter when calling the REST API several times in a job, like:
tFileList--iterate--tRest-->....