<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Job optimization - Excel file reading in Talend Studio</title>
    <link>https://community.qlik.com/t5/Talend-Studio/Job-optimization-Excel-file-reading/m-p/2204301#M5236</link>
    <description>&lt;P&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;Hello all,&lt;/FONT&gt;&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;--- Short version ---&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;Encountering a massive optimisation problem when reading multiple time Excel 2007 file which causes outOfMemory error.&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;How to increase performance, by only reading the Excel File 1 time per Sheet file rather than 1 time per table to insert ?&lt;/FONT&gt;&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;--- The long version ---&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;I'm encountering a problem with a "out of memory" error in Talend. I solved this problem, by increasing xmx and xms of my .ini file.&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;But now this is back, because my program will not be launched from my computer, but from a server which uses only half of the space I defined on my local machine.&lt;/FONT&gt;&lt;/P&gt; 
&lt;P&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;There's no possibility to increase xmx and xms from the server. So my only way is to optimize my code (in my opinion, but if you have a genius idea I'll take it &lt;span class="lia-inline-image-display-wrapper" image-alt="0683p000009MA9p.png"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/138034i5F552429DA646D6F/image-size/large?v=v2&amp;amp;px=999" role="button" title="0683p000009MA9p.png" alt="0683p000009MA9p.png" /&gt;&lt;/span&gt; )&lt;/FONT&gt;&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;My jobs are doing this :&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;- Reading an Excel 2007 file (.xlsx)&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;- Transforming some data to correspond with management rules&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;- Inserting in a table&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;- Committing&lt;/FONT&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;First optimization problem : &lt;/FONT&gt;&lt;/P&gt; 
&lt;P&gt;&lt;BR /&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;I'm reading the Excel file at the beginning of each job.&lt;/FONT&gt;&lt;/P&gt; 
&lt;P&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;In my file, I have columns, let's say A,B,C,D and E.&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;I need to insert the data from A,B,C into tableA.&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;Then inserting data from D into tableB.&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;And inserting data from E into tableC.&lt;/FONT&gt;&lt;/P&gt; 
&lt;P&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;This order cannot be changed, as long as I have foreign constraint in my database I need to insert all the data in a specific order.&lt;/FONT&gt;&lt;/P&gt; 
&lt;P&gt;&lt;BR /&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;For the rest of my program, I have many others tables, with many columns, and need to do the same ; that's why I need to optimize the process.&lt;/FONT&gt;&lt;/P&gt; 
&lt;P&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;So I'm reading the Excel file, I'm inserting into TableA ; I'm calling a subjob which insert data from column D into tableB, and then another subjob which insert col E into tableC.&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;At the end I got 17 jobs, with 17 same excel file reading (but not the same sheets).&lt;/FONT&gt;&lt;/P&gt; 
&lt;P&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;See the attachments with these 3 jobs.&lt;/FONT&gt;&lt;/P&gt; 
&lt;P&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;&lt;SPAN class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="First job" style="width: 999px;"&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="0683p000009M72o.png"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/134446i04A5CDB3E23BD8E1/image-size/large?v=v2&amp;amp;px=999" role="button" title="0683p000009M72o.png" alt="0683p000009M72o.png" /&gt;&lt;/span&gt;&lt;SPAN class="lia-inline-image-caption" onclick="event.preventDefault();"&gt;First job&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/P&gt; 
&lt;P&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;---&lt;/FONT&gt;&lt;/P&gt; 
&lt;P&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;&lt;SPAN class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Second job" style="width: 999px;"&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="0683p000009M6oz.png"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/133055i573DCAA184F9DDEA/image-size/large?v=v2&amp;amp;px=999" role="button" title="0683p000009M6oz.png" alt="0683p000009M6oz.png" /&gt;&lt;/span&gt;&lt;SPAN class="lia-inline-image-caption" onclick="event.preventDefault();"&gt;Second job&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/P&gt; 
&lt;P&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;---&lt;/FONT&gt;&lt;/P&gt; 
&lt;P&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;&lt;SPAN class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Third job" style="width: 999px;"&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="0683p000009M6Wx.png"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/152690i2A8F1191DAA1F794/image-size/large?v=v2&amp;amp;px=999" role="button" title="0683p000009M6Wx.png" alt="0683p000009M6Wx.png" /&gt;&lt;/span&gt;&lt;SPAN class="lia-inline-image-caption" onclick="event.preventDefault();"&gt;Third job&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/P&gt; 
&lt;P&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;---&lt;/FONT&gt;&lt;/P&gt; 
&lt;P&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;How can I optimize this pattern ?&lt;/FONT&gt;&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;I already tried multiple solution :&lt;/FONT&gt;&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;&lt;STRONG&gt;First&lt;/STRONG&gt;, rather than insert into tableA then tableB then tableC, I used a tMap with Main order 1 to insert into tableA, Main order 2 tableB and Main order3 tableC.&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;But it doesn't respect the "order", Talend is just parallelizing the flow, so my job is inserting B before A and C, which causes error in database due to foreign key constraint.&lt;/FONT&gt;&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;&lt;STRONG&gt;Second&lt;/STRONG&gt;, I used the "tReplicate" component, to multiply by 3 the input flow, it allows me to avoid to read the Excel File 3 times ; but the behavior is the same as tMap, the "order" is not respected, and I have the same issue as tMap...&lt;/FONT&gt;&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;&lt;STRONG&gt;Third&lt;/STRONG&gt;, I grouped my code with my 3 tables A,B and C into the same job, so I'm avoiding the multiple connection to database and disconnection.&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;Committing insert without ending the connection, it works on my machine ; but when I'm building the job and launching the .sh I got the error "outOfMemory".&lt;/FONT&gt;&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;&lt;STRONG&gt;Fourth&lt;/STRONG&gt;, same as third solution, but I grouped my code with the component tFileExcelInput, and not the actual components tFileExcelWorkbookOpen -&amp;gt; tFileExcelSheetInput&lt;/FONT&gt;&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;Right now I don't really know what to do, I heard about "tHashMap" and "tFlowToIterate/tFixedFlowInput" but never used this. Do you think it might be a good alternative ?&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;I have around 3 tLogRow per jobs, this might be a bad habit too ? &lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;And all the console is exported into a "log file" into a specific folder.&lt;/FONT&gt;&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;If you have an idea, whatever passing through your mind, I'll be glad to hear it and test it ASAP ! All the optimization I can get, I take it !&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;Thanks by advance, if you need more details, tell me and I'll add more screenshots etc...&lt;/FONT&gt;&lt;/P&gt;</description>
    <pubDate>Tue, 20 Aug 2019 13:24:35 GMT</pubDate>
    <dc:creator>jeoste</dc:creator>
    <dc:date>2019-08-20T13:24:35Z</dc:date>
    <item>
      <title>Job optimization - Excel file reading</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Job-optimization-Excel-file-reading/m-p/2204301#M5236</link>
      <description>&lt;P&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;Hello all,&lt;/FONT&gt;&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;--- Short version ---&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;Encountering a massive optimisation problem when reading multiple time Excel 2007 file which causes outOfMemory error.&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;How to increase performance, by only reading the Excel File 1 time per Sheet file rather than 1 time per table to insert ?&lt;/FONT&gt;&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;--- The long version ---&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;I'm encountering a problem with a "out of memory" error in Talend. I solved this problem, by increasing xmx and xms of my .ini file.&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;But now this is back, because my program will not be launched from my computer, but from a server which uses only half of the space I defined on my local machine.&lt;/FONT&gt;&lt;/P&gt; 
&lt;P&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;There's no possibility to increase xmx and xms from the server. So my only way is to optimize my code (in my opinion, but if you have a genius idea I'll take it &lt;span class="lia-inline-image-display-wrapper" image-alt="0683p000009MA9p.png"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/138034i5F552429DA646D6F/image-size/large?v=v2&amp;amp;px=999" role="button" title="0683p000009MA9p.png" alt="0683p000009MA9p.png" /&gt;&lt;/span&gt; )&lt;/FONT&gt;&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;My jobs are doing this :&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;- Reading an Excel 2007 file (.xlsx)&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;- Transforming some data to correspond with management rules&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;- Inserting in a table&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;- Committing&lt;/FONT&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;First optimization problem : &lt;/FONT&gt;&lt;/P&gt; 
&lt;P&gt;&lt;BR /&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;I'm reading the Excel file at the beginning of each job.&lt;/FONT&gt;&lt;/P&gt; 
&lt;P&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;In my file, I have columns, let's say A,B,C,D and E.&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;I need to insert the data from A,B,C into tableA.&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;Then inserting data from D into tableB.&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;And inserting data from E into tableC.&lt;/FONT&gt;&lt;/P&gt; 
&lt;P&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;This order cannot be changed, as long as I have foreign constraint in my database I need to insert all the data in a specific order.&lt;/FONT&gt;&lt;/P&gt; 
&lt;P&gt;&lt;BR /&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;For the rest of my program, I have many others tables, with many columns, and need to do the same ; that's why I need to optimize the process.&lt;/FONT&gt;&lt;/P&gt; 
&lt;P&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;So I'm reading the Excel file, I'm inserting into TableA ; I'm calling a subjob which insert data from column D into tableB, and then another subjob which insert col E into tableC.&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;At the end I got 17 jobs, with 17 same excel file reading (but not the same sheets).&lt;/FONT&gt;&lt;/P&gt; 
&lt;P&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;See the attachments with these 3 jobs.&lt;/FONT&gt;&lt;/P&gt; 
&lt;P&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;&lt;SPAN class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="First job" style="width: 999px;"&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="0683p000009M72o.png"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/134446i04A5CDB3E23BD8E1/image-size/large?v=v2&amp;amp;px=999" role="button" title="0683p000009M72o.png" alt="0683p000009M72o.png" /&gt;&lt;/span&gt;&lt;SPAN class="lia-inline-image-caption" onclick="event.preventDefault();"&gt;First job&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/P&gt; 
&lt;P&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;---&lt;/FONT&gt;&lt;/P&gt; 
&lt;P&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;&lt;SPAN class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Second job" style="width: 999px;"&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="0683p000009M6oz.png"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/133055i573DCAA184F9DDEA/image-size/large?v=v2&amp;amp;px=999" role="button" title="0683p000009M6oz.png" alt="0683p000009M6oz.png" /&gt;&lt;/span&gt;&lt;SPAN class="lia-inline-image-caption" onclick="event.preventDefault();"&gt;Second job&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/P&gt; 
&lt;P&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;---&lt;/FONT&gt;&lt;/P&gt; 
&lt;P&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;&lt;SPAN class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Third job" style="width: 999px;"&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="0683p000009M6Wx.png"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/152690i2A8F1191DAA1F794/image-size/large?v=v2&amp;amp;px=999" role="button" title="0683p000009M6Wx.png" alt="0683p000009M6Wx.png" /&gt;&lt;/span&gt;&lt;SPAN class="lia-inline-image-caption" onclick="event.preventDefault();"&gt;Third job&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/P&gt; 
&lt;P&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;---&lt;/FONT&gt;&lt;/P&gt; 
&lt;P&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;How can I optimize this pattern ?&lt;/FONT&gt;&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;I already tried multiple solution :&lt;/FONT&gt;&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;&lt;STRONG&gt;First&lt;/STRONG&gt;, rather than insert into tableA then tableB then tableC, I used a tMap with Main order 1 to insert into tableA, Main order 2 tableB and Main order3 tableC.&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;But it doesn't respect the "order", Talend is just parallelizing the flow, so my job is inserting B before A and C, which causes error in database due to foreign key constraint.&lt;/FONT&gt;&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;&lt;STRONG&gt;Second&lt;/STRONG&gt;, I used the "tReplicate" component, to multiply by 3 the input flow, it allows me to avoid to read the Excel File 3 times ; but the behavior is the same as tMap, the "order" is not respected, and I have the same issue as tMap...&lt;/FONT&gt;&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;&lt;STRONG&gt;Third&lt;/STRONG&gt;, I grouped my code with my 3 tables A,B and C into the same job, so I'm avoiding the multiple connection to database and disconnection.&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;Committing insert without ending the connection, it works on my machine ; but when I'm building the job and launching the .sh I got the error "outOfMemory".&lt;/FONT&gt;&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;&lt;STRONG&gt;Fourth&lt;/STRONG&gt;, same as third solution, but I grouped my code with the component tFileExcelInput, and not the actual components tFileExcelWorkbookOpen -&amp;gt; tFileExcelSheetInput&lt;/FONT&gt;&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;Right now I don't really know what to do, I heard about "tHashMap" and "tFlowToIterate/tFixedFlowInput" but never used this. Do you think it might be a good alternative ?&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;I have around 3 tLogRow per jobs, this might be a bad habit too ? &lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;And all the console is exported into a "log file" into a specific folder.&lt;/FONT&gt;&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;If you have an idea, whatever passing through your mind, I'll be glad to hear it and test it ASAP ! All the optimization I can get, I take it !&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;Thanks by advance, if you need more details, tell me and I'll add more screenshots etc...&lt;/FONT&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 20 Aug 2019 13:24:35 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Job-optimization-Excel-file-reading/m-p/2204301#M5236</guid>
      <dc:creator>jeoste</dc:creator>
      <dc:date>2019-08-20T13:24:35Z</dc:date>
    </item>
    <item>
      <title>Re: Job optimization - Excel file reading</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Job-optimization-Excel-file-reading/m-p/2204302#M5237</link>
      <description>&lt;P&gt;I am unsure about a few things, can you clarify please?&lt;/P&gt; 
&lt;P&gt;1) Are the jobs called by subsequent jobs? So does job 1 start job 2 using a tRunJob and does job 2 start job 3 using a tRunJob?&lt;/P&gt; 
&lt;P&gt;2) What components are you using? I can't quite make them out from your images. You appear to maybe have an Excel component iterating straight to another file component. This looks weird.&lt;/P&gt; 
&lt;P&gt;3) How are the components you are iterating to configured? How are you passing the data and are they set to be created brand new on each iteration?&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;If you could answer the above it should give me a better idea. It may also lead to more questions though &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 20 Aug 2019 14:59:01 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Job-optimization-Excel-file-reading/m-p/2204302#M5237</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2019-08-20T14:59:01Z</dc:date>
    </item>
    <item>
      <title>Re: Job optimization - Excel file reading</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Job-optimization-Excel-file-reading/m-p/2204303#M5238</link>
      <description>&lt;P&gt;Sorry, I just saw the tags on your last post. That answers the component questions.&amp;nbsp;&lt;A href="https://community.qlik.com/s/profile/0053p000007LKk8AAG"&gt;@lli&lt;/A&gt;&amp;nbsp;may be able to advise as I believe you are using his components. Could you please answer the other questions?&lt;/P&gt;</description>
      <pubDate>Tue, 20 Aug 2019 15:00:48 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Job-optimization-Excel-file-reading/m-p/2204303#M5238</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2019-08-20T15:00:48Z</dc:date>
    </item>
    <item>
      <title>Re: Job optimization - Excel file reading</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Job-optimization-Excel-file-reading/m-p/2204304#M5239</link>
      <description>&lt;P&gt;An observation, you appear to be using OnComponentOK links where it is possibly better to be using OnSubJobOK links. You certainly need to adjust this. At the moment the flow of your job is very difficult to figure out&lt;/P&gt;</description>
      <pubDate>Tue, 20 Aug 2019 15:04:46 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Job-optimization-Excel-file-reading/m-p/2204304#M5239</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2019-08-20T15:04:46Z</dc:date>
    </item>
    <item>
      <title>Re: Job optimization - Excel file reading</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Job-optimization-Excel-file-reading/m-p/2204305#M5240</link>
      <description>&lt;P&gt;Thanks for your answer !&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;Sure let me clarify your questions&lt;/P&gt; 
&lt;BLOCKQUOTE&gt; 
 &lt;HR /&gt; 
 &lt;A href="https://community.qlik.com/s/profile/005390000069RuGAAU"&gt;@rhall&lt;/A&gt;&amp;nbsp;wrote: 
 &lt;BR /&gt; 
 &lt;P&gt;&amp;nbsp;&lt;/P&gt; 
 &lt;P&gt;1) Are the jobs called by subsequent jobs? So does job 1 start job 2 using a tRunJob and does job 2 start job 3 using a tRunJob?&lt;/P&gt; 
 &lt;HR /&gt; 
&lt;/BLOCKQUOTE&gt; 
&lt;P&gt;That's right, my first job is starting job 2 (which is called Subjob "sports"), if the component "Commit t1" succeeds.&lt;/P&gt; 
&lt;P&gt;Then job 2 is starting job 3 etc...&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;BLOCKQUOTE&gt; 
 &lt;HR /&gt; 
 &lt;A href="https://community.qlik.com/s/profile/005390000069RuGAAU"&gt;@rhall&lt;/A&gt;&amp;nbsp;wrote: 
 &lt;BR /&gt; 
 &lt;P&gt;2) What components are you using? I can't quite make them out from your images. You appear to maybe have an Excel component iterating straight to another file component. This looks weird.&lt;/P&gt; 
 &lt;HR /&gt; 
&lt;/BLOCKQUOTE&gt; 
&lt;P&gt;These components comes from the Talend Exchange right !&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;BLOCKQUOTE&gt; 
 &lt;HR /&gt; 
 &lt;A href="https://community.qlik.com/s/profile/005390000069RuGAAU"&gt;@rhall&lt;/A&gt;&amp;nbsp;wrote: 
 &lt;BR /&gt; 
 &lt;P&gt;3) How are the components you are iterating to configured? How are you passing the data and are they set to be created brand new on each iteration?&lt;/P&gt; 
 &lt;HR /&gt; 
&lt;/BLOCKQUOTE&gt; 
&lt;P&gt;The first component is the "tFileExcelWorkbookOpen", which contain the path to my Excel file.&lt;/P&gt; 
&lt;P&gt;The second one, tFileSheetInput is configured like this :&lt;/P&gt; 
&lt;P&gt;&lt;SPAN class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="image.png" style="width: 999px;"&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="0683p000009M72t.png"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/147989i3C095DD69639CC37/image-size/large?v=v2&amp;amp;px=999" role="button" title="0683p000009M72t.png" alt="0683p000009M72t.png" /&gt;&lt;/span&gt;&lt;/SPAN&gt;&lt;/P&gt; 
&lt;P&gt;"Structur my sport" is the name of the sheet inside the Excel.&lt;/P&gt; 
&lt;P&gt;"4" is the header, it begins to read the data at the line 4 of my excel&lt;/P&gt; 
&lt;P&gt;"R" is the column where the data needs to be read&lt;/P&gt; 
&lt;P&gt;See the screenshot below&lt;/P&gt; 
&lt;P&gt;&lt;SPAN class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="image.png" style="width: 400px;"&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="0683p000009M6zI.png"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/133468i9BFCEAAE863C8A53/image-size/large?v=v2&amp;amp;px=999" role="button" title="0683p000009M6zI.png" alt="0683p000009M6zI.png" /&gt;&lt;/span&gt;&lt;/SPAN&gt;&lt;/P&gt; 
&lt;BLOCKQUOTE&gt; 
 &lt;HR /&gt; 
 &lt;A href="https://community.qlik.com/s/profile/005390000069RuGAAU"&gt;@rhall&lt;/A&gt;&amp;nbsp;wrote: 
 &lt;BR /&gt; 
 &lt;P&gt;3) How are you passing the data and are they set to be created brand new on each iteration?&lt;/P&gt; 
 &lt;HR /&gt; 
&lt;/BLOCKQUOTE&gt; 
&lt;P&gt;I think this is a good point, at each iteration, the file might be read ; when i'm running this job, I can see the iteration increasing through time, but it takes 1 minute and a half to read the entire file (with another sheet with 1600 lines).&lt;/P&gt; 
&lt;P&gt;But when I'm doing the same with tFileInputExcel, it takes 4 seconds.&lt;/P&gt;</description>
      <pubDate>Tue, 20 Aug 2019 15:37:09 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Job-optimization-Excel-file-reading/m-p/2204305#M5240</guid>
      <dc:creator>jeoste</dc:creator>
      <dc:date>2019-08-20T15:37:09Z</dc:date>
    </item>
    <item>
      <title>Re: Job optimization - Excel file reading</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Job-optimization-Excel-file-reading/m-p/2204306#M5241</link>
      <description>&lt;BLOCKQUOTE&gt; 
 &lt;HR /&gt; 
 &lt;A href="https://community.qlik.com/s/profile/005390000069RuGAAU"&gt;@rhall&lt;/A&gt;&amp;nbsp;wrote: 
 &lt;BR /&gt; 
 &lt;P&gt;An observation, you appear to be using OnComponentOK links where it is possibly better to be using OnSubJobOK links. You certainly need to adjust this. At the moment the flow of your job is very difficult to figure out&lt;/P&gt; 
 &lt;HR /&gt; 
&lt;/BLOCKQUOTE&gt; 
&lt;P&gt;Thanks you, I changed all the links I could with OnSubjobOk instead of onComponentOk&lt;/P&gt;</description>
      <pubDate>Tue, 20 Aug 2019 15:39:10 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Job-optimization-Excel-file-reading/m-p/2204306#M5241</guid>
      <dc:creator>jeoste</dc:creator>
      <dc:date>2019-08-20T15:39:10Z</dc:date>
    </item>
    <item>
      <title>Re: Job optimization - Excel file reading</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Job-optimization-Excel-file-reading/m-p/2204307#M5242</link>
      <description>&lt;P&gt;OK, we might have found a way in which your job is not very efficient. I'm not entirely sure what the iterate link will do from the&amp;nbsp;&lt;SPAN&gt;tFileExcelWorkbookOpen component, but since you have hardcoded your sheet in the next component, you should probably use another link. The&amp;nbsp;tFileExcelWorkbookOpen simply provides a "connection" to your Excel file (by the look of it). I presume that this could go at the very beginning of your job (with all and any other connections). You can connect these with OnSubJobOk links.&lt;/SPAN&gt;&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;&lt;SPAN&gt;The other issue that I have spotted is that you are opening this Excel file multiple times at once. You open it for each job and since each of the jobs is "inside" another one, you are being really inefficient with your memory. Why do you need 3 jobs? Could you not retrieve all of the data in one job? There are components called tHashInput and tHashOutput. These store data in memory. What you could do is load the data into a tHashOutput (one per sheet maybe) and then read it back using corresponding tHashInput components in the same job. You can load the data in order this way by reading from the tHashInputs in order. Something like this flow.....&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;DB Connection&lt;BR /&gt;|&lt;/SPAN&gt;&lt;/P&gt; 
&lt;P&gt;&lt;SPAN&gt;OnSubJobOK&lt;BR /&gt;|&lt;/SPAN&gt;&lt;/P&gt; 
&lt;P&gt;&lt;SPAN&gt;tFileExcelWorkbookOpen&lt;BR /&gt;|&lt;BR /&gt;OnSubJobOK&lt;/SPAN&gt;&lt;/P&gt; 
&lt;P&gt;&lt;SPAN&gt;|&lt;/SPAN&gt;&lt;/P&gt; 
&lt;P&gt;&lt;SPAN&gt;tFileSheetInput----&amp;gt;tHashOutput (Sheet1)&lt;/SPAN&gt;&lt;/P&gt; 
&lt;P&gt;&lt;SPAN&gt;|&lt;/SPAN&gt;&lt;/P&gt; 
&lt;P&gt;&lt;SPAN&gt;OnSubJobOK&lt;/SPAN&gt;&lt;/P&gt; 
&lt;P&gt;&lt;SPAN&gt;|&lt;/SPAN&gt;&lt;/P&gt; 
&lt;P&gt;&lt;SPAN&gt;tFileSheetInput----&amp;gt;tHashOutput (Sheet2)&lt;/SPAN&gt;&lt;/P&gt; 
&lt;P&gt;&lt;SPAN&gt;|&lt;/SPAN&gt;&lt;/P&gt; 
&lt;P&gt;&lt;SPAN&gt;OnSubJobOK&lt;/SPAN&gt;&lt;/P&gt; 
&lt;P&gt;&lt;SPAN&gt;|&lt;/SPAN&gt;&lt;/P&gt; 
&lt;P&gt;&lt;SPAN&gt;tFileSheetInput----&amp;gt;tHashOutput (Sheet3)&lt;BR /&gt;|&lt;/SPAN&gt;&lt;/P&gt; 
&lt;P&gt;&lt;SPAN&gt;OnSubJobOK&lt;/SPAN&gt;&lt;/P&gt; 
&lt;P&gt;&lt;SPAN&gt;|&lt;/SPAN&gt;&lt;/P&gt; 
&lt;P&gt;&lt;SPAN&gt;tHashInput ----&amp;gt; Database (Sheet 1)&lt;/SPAN&gt;&lt;/P&gt; 
&lt;P&gt;&lt;SPAN&gt;|&lt;/SPAN&gt;&lt;/P&gt; 
&lt;P&gt;&lt;SPAN&gt;OnSubJobOK&lt;/SPAN&gt;&lt;/P&gt; 
&lt;P&gt;&lt;SPAN&gt;|&lt;/SPAN&gt;&lt;/P&gt; 
&lt;P&gt;&lt;SPAN&gt;tHashInput ----&amp;gt; Database (Sheet 2)&lt;/SPAN&gt;&lt;/P&gt; 
&lt;P&gt;&lt;SPAN&gt;|&lt;/SPAN&gt;&lt;/P&gt; 
&lt;P&gt;&lt;SPAN&gt;OnSubJobOK&lt;/SPAN&gt;&lt;/P&gt; 
&lt;P&gt;&lt;SPAN&gt;|&lt;/SPAN&gt;&lt;/P&gt; 
&lt;P&gt;&lt;SPAN&gt;tHashInput ----&amp;gt; Database (Sheet 3)&lt;/SPAN&gt;&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;&lt;SPAN&gt;The above is obviously a very high level description. But what you are doing there is reading the data only once from the Excel file and only creating one connection.&lt;/SPAN&gt;&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 20 Aug 2019 16:47:42 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Job-optimization-Excel-file-reading/m-p/2204307#M5242</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2019-08-20T16:47:42Z</dc:date>
    </item>
    <item>
      <title>Re: Job optimization - Excel file reading</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Job-optimization-Excel-file-reading/m-p/2204308#M5243</link>
      <description>&lt;P&gt;Your design ist ok, but actually you do not need to cache the data in the tHash components. You can write them directly in the flow starting with the tFileExcelSheetInput components.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 20 Aug 2019 17:45:47 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Job-optimization-Excel-file-reading/m-p/2204308#M5243</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2019-08-20T17:45:47Z</dc:date>
    </item>
    <item>
      <title>Re: Job optimization - Excel file reading</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Job-optimization-Excel-file-reading/m-p/2204309#M5244</link>
      <description>&lt;P&gt;You're right&amp;nbsp;&lt;A href="https://community.qlik.com/s/profile/0053p000007LKk8AAG"&gt;@lli&lt;/A&gt;, I was thinking from the perspective of the Talend Excel components (not yours) and that the data would be used multiple times. Thanks for picking that up &lt;span class="lia-unicode-emoji" title=":winking_face:"&gt;😉&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 20 Aug 2019 20:26:15 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Job-optimization-Excel-file-reading/m-p/2204309#M5244</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2019-08-20T20:26:15Z</dc:date>
    </item>
    <item>
      <title>Re: Job optimization - Excel file reading</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Job-optimization-Excel-file-reading/m-p/2204310#M5245</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt; 
&lt;P&gt;Once again, thanks for your responses &lt;span class="lia-inline-image-display-wrapper" image-alt="0683p000009MACn.png"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/154443iC5B8CACEF3D12C6A/image-size/large?v=v2&amp;amp;px=999" role="button" title="0683p000009MACn.png" alt="0683p000009MACn.png" /&gt;&lt;/span&gt;&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;&lt;A href="https://community.qlik.com/s/profile/005390000069RuGAAU"&gt;@rhall&lt;/A&gt;I tried your design I followed the &lt;A href="https://community.qlik.com/s/profile/0053p000007LKk8AAG"&gt;@lli&lt;/A&gt; ' tips so I didn't used the tHash components, and it works perfectly on my machine !&lt;/P&gt; 
&lt;P&gt;I have to wait the end of the week to test it on the server, because my colleague is the only one who can access it, but I'm very confident about the result. I'll accept your solution as the good answer !&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;More info about the optimization result :&lt;/P&gt; 
&lt;P&gt;For now I've made 3 tests. The first one with all the components (this is in the previous attachments), it took 4GB of RAM, and failed due to the outofMemory's error.&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;Second one, with the first optimization, I just deleted many tFileExcelWorkbookOpen as you recommended, here's the graph in jvisualvm&lt;/P&gt; 
&lt;P&gt;&lt;SPAN class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="image.png" style="width: 999px;"&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="0683p000009M70e.png"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/140074i9B3754AE3EEEFBFC/image-size/large?v=v2&amp;amp;px=999" role="button" title="0683p000009M70e.png" alt="0683p000009M70e.png" /&gt;&lt;/span&gt;&lt;/SPAN&gt;&lt;/P&gt; 
&lt;P&gt;Still a peek, which corresponds to some others jobs. I've decided to refacto this code too and this is the final result :&lt;/P&gt; 
&lt;P&gt;&lt;SPAN class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="image.png" style="width: 999px;"&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="0683p000009M75T.png"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/128030i2951D0187F9152AD/image-size/large?v=v2&amp;amp;px=999" role="button" title="0683p000009M75T.png" alt="0683p000009M75T.png" /&gt;&lt;/span&gt;&lt;/SPAN&gt;&lt;/P&gt; 
&lt;P&gt;From 4GB to 1.2GB and 50% less time, just for one row.&lt;/P&gt;</description>
      <pubDate>Wed, 21 Aug 2019 10:44:44 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Job-optimization-Excel-file-reading/m-p/2204310#M5245</guid>
      <dc:creator>jeoste</dc:creator>
      <dc:date>2019-08-21T10:44:44Z</dc:date>
    </item>
    <item>
      <title>Re: Job optimization - Excel file reading</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Job-optimization-Excel-file-reading/m-p/2204311#M5246</link>
      <description>As promise here's a little update about this topic. 
&lt;BR /&gt; 
&lt;BR /&gt;The code above optimized the time factor by 50% ; but I've discovered that the main problem came from the Excel file. There's lots of operation, links between sheets and it costs tons of RAM (2.5Gb with the operations, 50Mb without) 
&lt;BR /&gt;To optimize again I copy the entire sheet, and I paste it "as value" in Excel. I did that for all the sheets. 
&lt;BR /&gt; 
&lt;BR /&gt;The execution is now around 12 sec with my local db instead of 14 min. 
&lt;BR /&gt;In the customer's environment it's 10 min, instead of 30+ min. 
&lt;BR /&gt;The RAM used dropped from 2GB to 1GB for the entire Excel file, which is perfect ! 
&lt;BR /&gt; 
&lt;BR /&gt;If someone has this case, don't forget to optimize your code but check by the same time your input file ! Excel is always guilty 
&lt;span class="lia-inline-image-display-wrapper" image-alt="0683p000009MA9p.png"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/138034i5F552429DA646D6F/image-size/large?v=v2&amp;amp;px=999" role="button" title="0683p000009MA9p.png" alt="0683p000009MA9p.png" /&gt;&lt;/span&gt;</description>
      <pubDate>Thu, 29 Aug 2019 18:30:51 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Job-optimization-Excel-file-reading/m-p/2204311#M5246</guid>
      <dc:creator>jeoste</dc:creator>
      <dc:date>2019-08-29T18:30:51Z</dc:date>
    </item>
  </channel>
</rss>

