<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: tFileOutputDelimited - split output in multiple files and zip at the same time in Talend Studio</title>
    <link>https://community.qlik.com/t5/Talend-Studio/tFileOutputDelimited-split-output-in-multiple-files-and-zip-at/m-p/2297126#M69729</link>
    <description>&lt;P&gt;Hi Arne,&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;About the only practical way I can think of to do this would be to fetch the records from your source database in batches, each the size of one output file, as follows:&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;&lt;SPAN class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="BatchedZippedCSVOutput.png" style="width: 727px;"&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="0683p000009LrRj.png"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/129161i694B1B006330B0F6/image-size/large?v=v2&amp;amp;px=999" role="button" title="0683p000009LrRj.png" alt="0683p000009LrRj.png" /&gt;&lt;/span&gt;&lt;/SPAN&gt;&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;A quick query to get the record count:&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;&lt;SPAN class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="tMySQLInput.png" style="width: 586px;"&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="0683p000009LrRo.png"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/143801iD54B34F9D62AC681/image-size/large?v=v2&amp;amp;px=999" role="button" title="0683p000009LrRo.png" alt="0683p000009LrRo.png" /&gt;&lt;/span&gt;&lt;/SPAN&gt;&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;And store this in a global variable for later:&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;&lt;SPAN class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="tSetGlobalVar.png" style="width: 974px;"&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="0683p000009LrOl.png"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/152360i39131C832425F767/image-size/large?v=v2&amp;amp;px=999" role="button" title="0683p000009LrOl.png" alt="0683p000009LrOl.png" /&gt;&lt;/span&gt;&lt;/SPAN&gt;&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;Add a context&amp;nbsp;variable for the batch size, as it's&amp;nbsp;used in two components, and so should be maintained in one place:&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;&lt;SPAN class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Context.png" style="width: 999px;"&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="0683p000009Lr3t.png"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/133592i5D9819D36C0F708B/image-size/large?v=v2&amp;amp;px=999" role="button" title="0683p000009Lr3t.png" alt="0683p000009Lr3t.png" /&gt;&lt;/span&gt;&lt;/SPAN&gt;&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;The tLoop in "for" mode allows us to iterate&amp;nbsp;and get the necessary offset for each batch of records:&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;&lt;SPAN class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="tLoop.png" style="width: 524px;"&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="0683p000009LrS8.png"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/137023i8E9BEBBA4ADA07E9/image-size/large?v=v2&amp;amp;px=999" role="button" title="0683p000009LrS8.png" alt="0683p000009LrS8.png" /&gt;&lt;/span&gt;&lt;/SPAN&gt;&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;In the database&amp;nbsp;input component, we build a query&amp;nbsp;with the correct LIMIT and OFFSET (or whatever's appropriate for your DBMS):&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;&lt;SPAN class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="tMySQLInput2.png" style="width: 591px;"&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="0683p000009LrQ3.png"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/137837i446439A6F0194A8C/image-size/large?v=v2&amp;amp;px=999" role="button" title="0683p000009LrQ3.png" alt="0683p000009LrQ3.png" /&gt;&lt;/span&gt;&lt;/SPAN&gt;&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;And then we simply output using&amp;nbsp;a tFileOutputDelimited, with zip compression enabled, and&amp;nbsp;a dynamic filename, in my case based on the record offset to keep it simple:&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;&lt;SPAN class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="tFileOutputDelimited.png" style="width: 788px;"&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="0683p000009LrS9.png"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/147626iD7AD6B1CB6AC7898/image-size/large?v=v2&amp;amp;px=999" role="button" title="0683p000009LrS9.png" alt="0683p000009LrS9.png" /&gt;&lt;/span&gt;&lt;/SPAN&gt;&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;Giving us our&amp;nbsp;zipped CSV files:&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;&lt;SPAN class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="CompressedCSVFiles.png" style="width: 579px;"&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="0683p000009LrGO.png"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/140483i16AF9FB7A54F73B8/image-size/large?v=v2&amp;amp;px=999" role="button" title="0683p000009LrGO.png" alt="0683p000009LrGO.png" /&gt;&lt;/span&gt;&lt;/SPAN&gt;&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;Regards,&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;Chris&lt;/P&gt;</description>
    <pubDate>Wed, 18 Oct 2017 17:28:22 GMT</pubDate>
    <dc:creator>Anonymous</dc:creator>
    <dc:date>2017-10-18T17:28:22Z</dc:date>
    <item>
      <title>tFileOutputDelimited - split output in multiple files and zip at the same time</title>
      <link>https://community.qlik.com/t5/Talend-Studio/tFileOutputDelimited-split-output-in-multiple-files-and-zip-at/m-p/2297125#M69728</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;I'm trying to unload a table with ~1.5 to 2 billion entries. The expected behaviour is: the data is exported, the result is split into multiple CSV files, AND the resulting files are compressed at the same time in order to save disk space.&lt;/P&gt; 
&lt;P&gt;The tFileOutputDelimited has the feature to compress the resulting file OR split the result into multiple files, but not both at the same time.&lt;/P&gt; 
&lt;P&gt;So my question is: how can I split and compress the output at the same time? Is there a way to trigger another process as soon as one subset of rows has been written to a file?&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;Best&lt;/P&gt; 
&lt;P&gt;Arne&lt;/P&gt;</description>
      <pubDate>Sat, 16 Nov 2024 09:10:06 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/tFileOutputDelimited-split-output-in-multiple-files-and-zip-at/m-p/2297125#M69728</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2024-11-16T09:10:06Z</dc:date>
    </item>
    <item>
      <title>Re: tFileOutputDelimited - split output in multiple files and zip at the same time</title>
      <link>https://community.qlik.com/t5/Talend-Studio/tFileOutputDelimited-split-output-in-multiple-files-and-zip-at/m-p/2297126#M69729</link>
      <description>&lt;P&gt;Hi Arne,&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;About the only practical way I can think of to do this would be to fetch the records from your source database in batches, each the size of one output file, as follows:&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;&lt;SPAN class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="BatchedZippedCSVOutput.png" style="width: 727px;"&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="0683p000009LrRj.png"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/129161i694B1B006330B0F6/image-size/large?v=v2&amp;amp;px=999" role="button" title="0683p000009LrRj.png" alt="0683p000009LrRj.png" /&gt;&lt;/span&gt;&lt;/SPAN&gt;&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;A quick query to get the record count:&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;&lt;SPAN class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="tMySQLInput.png" style="width: 586px;"&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="0683p000009LrRo.png"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/143801iD54B34F9D62AC681/image-size/large?v=v2&amp;amp;px=999" role="button" title="0683p000009LrRo.png" alt="0683p000009LrRo.png" /&gt;&lt;/span&gt;&lt;/SPAN&gt;&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;And store this in a global variable for later:&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;&lt;SPAN class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="tSetGlobalVar.png" style="width: 974px;"&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="0683p000009LrOl.png"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/152360i39131C832425F767/image-size/large?v=v2&amp;amp;px=999" role="button" title="0683p000009LrOl.png" alt="0683p000009LrOl.png" /&gt;&lt;/span&gt;&lt;/SPAN&gt;&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;Add a context&amp;nbsp;variable for the batch size, as it's&amp;nbsp;used in two components, and so should be maintained in one place:&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;&lt;SPAN class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Context.png" style="width: 999px;"&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="0683p000009Lr3t.png"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/133592i5D9819D36C0F708B/image-size/large?v=v2&amp;amp;px=999" role="button" title="0683p000009Lr3t.png" alt="0683p000009Lr3t.png" /&gt;&lt;/span&gt;&lt;/SPAN&gt;&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;The tLoop in "for" mode allows us to iterate&amp;nbsp;and get the necessary offset for each batch of records:&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;&lt;SPAN class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="tLoop.png" style="width: 524px;"&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="0683p000009LrS8.png"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/137023i8E9BEBBA4ADA07E9/image-size/large?v=v2&amp;amp;px=999" role="button" title="0683p000009LrS8.png" alt="0683p000009LrS8.png" /&gt;&lt;/span&gt;&lt;/SPAN&gt;&lt;/P&gt; 
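&lt;P&gt;As a rough illustration of the offset arithmetic (not the Talend job itself), the loop can be sketched in Python. This assumes the loop starts at 0 and steps by the batch size until the record count is reached; the function name and the sample numbers are made up for the example:&lt;/P&gt; 
&lt;PRE&gt;
```python
def batch_offsets(total_rows, batch_size):
    """Return the starting offset of each batch needed to cover total_rows."""
    if not batch_size > 0:
        raise ValueError("batch_size must be positive")
    # One offset per batch: 0, batch_size, 2 * batch_size, ...
    # The last batch may contain fewer than batch_size rows.
    return list(range(0, total_rows, batch_size))


# Hypothetical numbers: 10 records, 4 records per output file.
offsets = batch_offsets(total_rows=10, batch_size=4)
print(offsets)  # [0, 4, 8]
```
&lt;/PRE&gt;
&lt;P&gt;Each offset then drives one pass through the database input and the file output.&lt;/P&gt; 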
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;In the database&amp;nbsp;input component, we build a query&amp;nbsp;with the correct LIMIT and OFFSET (or whatever's appropriate for your DBMS):&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;&lt;SPAN class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="tMySQLInput2.png" style="width: 591px;"&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="0683p000009LrQ3.png"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/137837i446439A6F0194A8C/image-size/large?v=v2&amp;amp;px=999" role="button" title="0683p000009LrQ3.png" alt="0683p000009LrQ3.png" /&gt;&lt;/span&gt;&lt;/SPAN&gt;&lt;/P&gt; 
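&lt;P&gt;For readers who cannot see the screenshot, here is a minimal Python sketch of how such a paged query might be assembled. The table name, the ordering column and the helper name are hypothetical, and the ORDER BY clause is an addition of this sketch: without a stable ordering, the database is free to return rows in a different order on each call, so batches could overlap or skip rows.&lt;/P&gt; 
&lt;PRE&gt;
```python
def build_batch_query(table, order_column, batch_size, offset):
    """Build a MySQL-style paged SELECT for one batch of rows."""
    # int() makes sure the numeric parts really are numbers before they
    # are placed into the SQL text.
    return (
        f"SELECT * FROM {table} "
        f"ORDER BY {order_column} "
        f"LIMIT {int(batch_size)} OFFSET {int(offset)}"
    )


# Hypothetical table and column names, third batch of 1000 rows.
query = build_batch_query("my_table", "id", batch_size=1000, offset=2000)
print(query)  # SELECT * FROM my_table ORDER BY id LIMIT 1000 OFFSET 2000
```
&lt;/PRE&gt;
&lt;P&gt;Other DBMSs use different paging syntax, so the LIMIT and OFFSET part would need adapting.&lt;/P&gt; 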
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;And then we simply output using&amp;nbsp;a tFileOutputDelimited, with zip compression enabled, and&amp;nbsp;a dynamic filename, in my case based on the record offset to keep it simple:&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;&lt;SPAN class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="tFileOutputDelimited.png" style="width: 788px;"&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="0683p000009LrS9.png"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/147626iD7AD6B1CB6AC7898/image-size/large?v=v2&amp;amp;px=999" role="button" title="0683p000009LrS9.png" alt="0683p000009LrS9.png" /&gt;&lt;/span&gt;&lt;/SPAN&gt;&lt;/P&gt; 
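&lt;P&gt;The same idea, one compressed CSV per batch with the offset in the filename, can be sketched outside Talend with Python's standard library. The path prefix, the column names and the sample rows below are invented for the example, and the naming scheme is only one possibility:&lt;/P&gt; 
&lt;PRE&gt;
```python
import csv
import io
import os
import tempfile
import zipfile


def write_zipped_csv(path_prefix, offset, header, rows):
    """Write one batch as a single CSV inside its own zip; return the zip path."""
    zip_path = f"{path_prefix}_{offset}.zip"
    # Build the CSV text in memory, then store it compressed in the archive.
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(header)
    writer.writerows(rows)
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        archive.writestr(f"batch_{offset}.csv", buffer.getvalue())
    return zip_path


# Hypothetical batch: two rows starting at offset 2000, written to a temp directory.
out_dir = tempfile.mkdtemp()
path = write_zipped_csv(
    os.path.join(out_dir, "export"), 2000, ["id", "name"], [[1, "a"], [2, "b"]]
)
print(os.path.basename(path))  # export_2000.zip
```
&lt;/PRE&gt;
&lt;P&gt;Because the offset is unique per batch, no two batches write to the same archive.&lt;/P&gt; 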
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;Giving us our&amp;nbsp;zipped CSV files:&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;&lt;SPAN class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="CompressedCSVFiles.png" style="width: 579px;"&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="0683p000009LrGO.png"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/140483i16AF9FB7A54F73B8/image-size/large?v=v2&amp;amp;px=999" role="button" title="0683p000009LrGO.png" alt="0683p000009LrGO.png" /&gt;&lt;/span&gt;&lt;/SPAN&gt;&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;Regards,&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;Chris&lt;/P&gt;</description>
      <pubDate>Wed, 18 Oct 2017 17:28:22 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/tFileOutputDelimited-split-output-in-multiple-files-and-zip-at/m-p/2297126#M69729</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2017-10-18T17:28:22Z</dc:date>
    </item>
    <item>
      <title>Re: tFileOutputDelimited - split output in multiple files and zip at the same time</title>
      <link>https://community.qlik.com/t5/Talend-Studio/tFileOutputDelimited-split-output-in-multiple-files-and-zip-at/m-p/2297127#M69730</link>
      <description>&lt;P&gt;Thanks a lot for this solution. I will give it a try. Unfortunately, I have to retrieve the data from an old data sink which does not have any indices at all (and it's a data view), so counting the records may take a while.&lt;/P&gt;</description>
      <pubDate>Thu, 19 Oct 2017 08:49:01 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/tFileOutputDelimited-split-output-in-multiple-files-and-zip-at/m-p/2297127#M69730</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2017-10-19T08:49:01Z</dc:date>
    </item>
  </channel>
</rss>

