<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Read a 200 MB JSON File lasts forever. in Talend Studio</title>
    <link>https://community.qlik.com/t5/Talend-Studio/Read-a-200-MB-JSON-File-lasts-forever/m-p/2261271#M42102</link>
    <description>&lt;P&gt;Hello everyone, I have a small problem:&lt;/P&gt; 
&lt;P&gt;I'm currently rebuilding an existing job in Talend. The problem is that the data is retrieved from a REST API over a runtime of about 2 hours. If the REST API stops responding even briefly, the job aborts.&lt;/P&gt; 
&lt;P&gt;Since the job already runs and is in use, I would prefer not to change its structure much, so I wrote a Python script that downloads the data in advance and combines it into a single JSON file. The JSON has the same structure as the REST API response and can (theoretically) be used in the job without problems. I tested the setup with a small file (3 MB) and it works. But when I try to load the 200 MB file, it loads forever. I aborted the last attempt after 12 hours (see the result in the attachment).&lt;/P&gt; 
&lt;P&gt;I also have to use "tFileInputRaw", because the fields to extract are already configured in "tExtractJsonField", and "tFileInputJSON" does not offer the "Is Array" option.&lt;/P&gt; 
&lt;P&gt;I just can't imagine that Talend cannot read a 200 MB file. For comparison, I wrote a Python script that extracts multiple values from the same JSON, and it took less than 30 seconds.&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;I hope you can tell me how to solve this problem.&lt;/P&gt; 
&lt;P&gt;Best regards&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;Edit: I tried different memory allocations, between 5 and 12 GB of the 16 GB available.&lt;/P&gt;</description>
    <pubDate>Sat, 16 Nov 2024 04:05:00 GMT</pubDate>
    <dc:creator>BooWseR</dc:creator>
    <dc:date>2024-11-16T04:05:00Z</dc:date>
    <item>
      <title>Read a 200 MB JSON File lasts forever.</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Read-a-200-MB-JSON-File-lasts-forever/m-p/2261271#M42102</link>
      <description>&lt;P&gt;Hello everyone, I have a small problem:&lt;/P&gt; 
&lt;P&gt;I'm currently rebuilding an existing job in Talend. The problem is that the data is retrieved from a REST API over a runtime of about 2 hours. If the REST API stops responding even briefly, the job aborts.&lt;/P&gt; 
&lt;P&gt;Since the job already runs and is in use, I would prefer not to change its structure much, so I wrote a Python script that downloads the data in advance and combines it into a single JSON file. The JSON has the same structure as the REST API response and can (theoretically) be used in the job without problems. I tested the setup with a small file (3 MB) and it works. But when I try to load the 200 MB file, it loads forever. I aborted the last attempt after 12 hours (see the result in the attachment).&lt;/P&gt; 
&lt;P&gt;I also have to use "tFileInputRaw", because the fields to extract are already configured in "tExtractJsonField", and "tFileInputJSON" does not offer the "Is Array" option.&lt;/P&gt; 
&lt;P&gt;I just can't imagine that Talend cannot read a 200 MB file. For comparison, I wrote a Python script that extracts multiple values from the same JSON, and it took less than 30 seconds.&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;I hope you can tell me how to solve this problem.&lt;/P&gt; 
&lt;P&gt;Best regards&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;Edit: I tried different memory allocations, between 5 and 12 GB of the 16 GB available.&lt;/P&gt;</description>
      <pubDate>Sat, 16 Nov 2024 04:05:00 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Read-a-200-MB-JSON-File-lasts-forever/m-p/2261271#M42102</guid>
      <dc:creator>BooWseR</dc:creator>
      <dc:date>2024-11-16T04:05:00Z</dc:date>
    </item>
    <item>
      <title>Re: Read a 200 MB JSON File lasts forever.</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Read-a-200-MB-JSON-File-lasts-forever/m-p/2261272#M42103</link>
      <description>If possible, split the file into multiple smaller files, iterate over them, and pass each file path as a parameter when calling the REST API several times in a Job, like: 
&lt;BR /&gt;tFileList--iterate--tRest--&amp;gt;.... 
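&lt;BR /&gt;A minimal sketch of the splitting step, written in Python since the data is already produced by a Python script. It assumes the top level of the JSON file is an array and that the whole file fits in memory; the function name, the chunk size, and the part_NNNN.json file names are hypothetical choices for illustration, not something from the original job.

```python
import json
from pathlib import Path


def split_json_array(source, out_dir, chunk_size=1000):
    """Split a JSON file whose top level is an array into smaller files.

    Returns the list of chunk file paths that were written.
    """
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)

    # Assumption: the whole file fits in memory when parsed by Python.
    with open(source, encoding="utf-8") as handle:
        records = json.load(handle)
    if not isinstance(records, list):
        raise ValueError("expected a top-level JSON array")

    written = []
    for index, start in enumerate(range(0, len(records), chunk_size)):
        # Each chunk is itself a valid JSON array with the same element structure.
        chunk = records[start:start + chunk_size]
        target = out_path / f"part_{index:04d}.json"
        with open(target, "w", encoding="utf-8") as handle:
            json.dump(chunk, handle)
        written.append(target)
    return written
```

Each part file keeps the array structure of the original, so the existing extraction settings may still apply; tFileList could then iterate over the output directory. If the real file wraps the array in an outer object, the wrapper would have to be re-created around each chunk.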
&lt;BR /&gt;</description>
      <pubDate>Thu, 21 Nov 2019 02:45:21 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Read-a-200-MB-JSON-File-lasts-forever/m-p/2261272#M42103</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2019-11-21T02:45:21Z</dc:date>
    </item>
  </channel>
</rss>

