<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: CSV file or Buffer memory, which is better to save mid data in the Job in Talend Studio</title>
    <link>https://community.qlik.com/t5/Talend-Studio/CSV-file-or-Buffer-memory-which-is-better-to-save-mid-data-in/m-p/2363992#M127794</link>
    <description>&lt;P&gt;Hi,&lt;/P&gt; 
&lt;P&gt;Given the number of records, splitting the data into multiple intermediate files may help if you can parallelize the operations you need to perform on these records.&lt;/P&gt;
&lt;P&gt;Otherwise, holding all the records in memory can cause memory issues, but that depends more on the total data size than on the number of records (are the records long or short?) and, of course, on the physical memory available.&lt;/P&gt;
&lt;P&gt;Also, text (or CSV) files are processed very quickly by the standard tFileInputDelimited or tFileInputFullRow components, so you don't really have to worry about response time when using them (in my opinion, unless you want to gain a few seconds, but I don't think that is the main concern in your case).&lt;/P&gt;
&lt;P&gt;Hope this helps.&lt;/P&gt;</description>
    <pubDate>Wed, 13 Dec 2017 18:05:34 GMT</pubDate>
    <dc:creator>TRF</dc:creator>
    <dc:date>2017-12-13T18:05:34Z</dc:date>
    <item>
      <title>CSV file or Buffer memory, which is better to save mid data in the Job</title>
      <link>https://community.qlik.com/t5/Talend-Studio/CSV-file-or-Buffer-memory-which-is-better-to-save-mid-data-in/m-p/2363991#M127793</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt; 
&lt;P&gt;Which is the better way to store intermediate data in a Job: a &lt;STRONG&gt;CSV&lt;/STRONG&gt; file or &lt;STRONG&gt;buffer memory&lt;/STRONG&gt; (&lt;STRONG&gt;&lt;EM&gt;hashoutput&lt;/EM&gt;&lt;/STRONG&gt;)?&lt;/P&gt;
&lt;P&gt;In my scenario, I am getting 4.4 million records from the source and I need to perform some operations on them. I am storing the data midway through the Job because the Job contains multiple subjobs.&lt;/P&gt;
&lt;P&gt;I am weighing several factors, such as performance and storage space, and the solution should not cause any memory issues.&lt;/P&gt;
&lt;P&gt;Please suggest the best method to use.&lt;/P&gt;
&lt;P&gt;Thanks in advance.&lt;/P&gt;</description>
      <pubDate>Sat, 16 Nov 2024 08:57:35 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/CSV-file-or-Buffer-memory-which-is-better-to-save-mid-data-in/m-p/2363991#M127793</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2024-11-16T08:57:35Z</dc:date>
    </item>
    <item>
      <title>Re: CSV file or Buffer memory, which is better to save mid data in the Job</title>
      <link>https://community.qlik.com/t5/Talend-Studio/CSV-file-or-Buffer-memory-which-is-better-to-save-mid-data-in/m-p/2363992#M127794</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt; 
&lt;P&gt;Given the number of records, splitting the data into multiple intermediate files may help if you can parallelize the operations you need to perform on these records.&lt;/P&gt;
&lt;P&gt;Otherwise, holding all the records in memory can cause memory issues, but that depends more on the total data size than on the number of records (are the records long or short?) and, of course, on the physical memory available.&lt;/P&gt;
&lt;P&gt;Also, text (or CSV) files are processed very quickly by the standard tFileInputDelimited or tFileInputFullRow components, so you don't really have to worry about response time when using them (in my opinion, unless you want to gain a few seconds, but I don't think that is the main concern in your case).&lt;/P&gt;
&lt;P&gt;Hope this helps.&lt;/P&gt;</description>
      <pubDate>Wed, 13 Dec 2017 18:05:34 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/CSV-file-or-Buffer-memory-which-is-better-to-save-mid-data-in/m-p/2363992#M127794</guid>
      <dc:creator>TRF</dc:creator>
      <dc:date>2017-12-13T18:05:34Z</dc:date>
    </item>
  </channel>
</rss>

