Hello everyone, I have a little problem:
I'm currently rebuilding an existing job in Talend. The problem: the job retrieves data from a REST API over a runtime of about 2 hours, and if the REST API briefly stops responding, the job aborts.
Since the job is already working and in use, I don't want to change its structure much. So I wrote a Python script that downloads the data beforehand and combines it into a single JSON file. The JSON has the same structure as the REST API's response, so it can (in theory) be used in the job without problems. I tested the setup with a small file (3 MB) and it works. But when I try to load the 200 MB file, it loads forever; I aborted the last attempt after 12 hours. (See the result in the attachment.)
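For context, the pre-download part of my script is essentially a retry wrapper around the API call, so short outages don't kill the run. Here is a minimal sketch of that idea; `fetch`, `retries`, and `backoff` are just names I made up for this example, and `fetch` stands in for the actual REST call:

```python
import json
import time

def fetch_with_retry(fetch, retries=5, backoff=2.0):
    """Call fetch(); on failure, wait with exponential backoff and retry.

    fetch: zero-argument callable standing in for the real REST request.
    Raises the last exception if all attempts fail.
    """
    for attempt in range(retries):
        try:
            return fetch()
        except Exception:
            if attempt == retries - 1:
                raise
            time.sleep(backoff * (2 ** attempt))

def dump_pages(pages, path):
    """Combine the collected page results into one JSON file,
    mirroring the structure of the API response."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(pages, f)
```

With this, a temporary non-response from the API just costs a few seconds of waiting instead of aborting the whole 2-hour run.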
I also have to use "tFileInputRaw", because the fields to extract are already configured in "tExtractJsonField", and "tFileInputJSON" doesn't offer the "Is Array" option.
I just can't imagine that Talend is unable to read a 200 MB file. For comparison, I wrote a Python script that extracts multiple values from the same JSON, and it took less than 30 seconds.
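To illustrate the comparison: plain `json.load` handles files of this shape in seconds. This sketch builds a synthetic file as a stand-in for the 200 MB API dump (the file name and record layout are invented for the example) and then parses it:

```python
import json
import time

# Build a synthetic JSON file as a stand-in for the large API dump.
records = [{"id": i, "value": f"item-{i}"} for i in range(100_000)]
with open("sample.json", "w", encoding="utf-8") as f:
    json.dump(records, f)

# Plain json.load parses the whole file in one go.
start = time.time()
with open("sample.json", encoding="utf-8") as f:
    data = json.load(f)
print(len(data), "records parsed in", round(time.time() - start, 2), "s")
```

So the file size itself shouldn't be the bottleneck; it looks like the Talend component is doing something far more expensive than a straight parse.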
I hope you can tell me how to solve this problem.
Best regards
Edit: I tried different memory allocations, between 5 and 12 GB of the 16 GB available.