<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic tFileInputExcel with more than approx. 2500 rows gives stackoverflow in Talend Studio</title>
    <link>https://community.qlik.com/t5/Talend-Studio/tFileInputExcel-with-more-than-approx-2500-rows-gives/m-p/2284723#M58357</link>
    <description>Does anyone else have problems with Excel as input? It seems to be the Pattern (regex) that causes a stack overflow once there are a bit more than 2500 rows. I have 7000 rows, but the error always occurs after approx. 2500 rows have been processed. Exporting to CSV and doing the same parsing gives me no problems. As a side note, CSV seems much faster to process.</description>
    <pubDate>Sat, 03 Nov 2012 20:30:13 GMT</pubDate>
    <dc:creator>Anonymous</dc:creator>
    <dc:date>2012-11-03T20:30:13Z</dc:date>
    <item>
      <title>tFileInputExcel with more than approx. 2500 rows gives stackoverflow</title>
      <link>https://community.qlik.com/t5/Talend-Studio/tFileInputExcel-with-more-than-approx-2500-rows-gives/m-p/2284723#M58357</link>
      <description>Does anyone else have problems with Excel as input? It seems to be the Pattern (regex) that causes a stack overflow once there are a bit more than 2500 rows. I have 7000 rows, but the error always occurs after approx. 2500 rows have been processed. Exporting to CSV and doing the same parsing gives me no problems. As a side note, CSV seems much faster to process.</description>
      <pubDate>Sat, 03 Nov 2012 20:30:13 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/tFileInputExcel-with-more-than-approx-2500-rows-gives/m-p/2284723#M58357</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2012-11-03T20:30:13Z</dc:date>
    </item>
    <item>
      <title>Re: tFileInputExcel with more than approx. 2500 rows gives stackoverflow</title>
      <link>https://community.qlik.com/t5/Talend-Studio/tFileInputExcel-with-more-than-approx-2500-rows-gives/m-p/2284724#M58358</link>
      <description>Hi, 
&lt;BR /&gt;How have you configured your tFileInputExcel component? The purpose of using regex is this: select this check box if you want to use a regular expression to filter the sheets to process. Could you share a screenshot of your job?
&lt;BR /&gt;Best regards
&lt;BR /&gt;Sabrina</description>
      <pubDate>Mon, 05 Nov 2012 02:41:41 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/tFileInputExcel-with-more-than-approx-2500-rows-gives/m-p/2284724#M58358</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2012-11-05T02:41:41Z</dc:date>
    </item>
    <item>
      <title>Re: tFileInputExcel with more than approx. 2500 rows gives stackoverflow</title>
      <link>https://community.qlik.com/t5/Talend-Studio/tFileInputExcel-with-more-than-approx-2500-rows-gives/m-p/2284725#M58359</link>
      <description>Hello
&lt;BR /&gt;I dropped it again due to the problem, and went for CSV.
&lt;BR /&gt;I was of course careful not to enable or use regexp anywhere in this test.
&lt;BR /&gt;The scenario is simple: make an Excel file with 7000 rows and, let's say, five columns. The content can be anything, and can even be the same for each row. Read it in and just use tLogRow.
&lt;BR /&gt;Regarding patterns, the problem is specific to this component. I have since used the same data from CSV, converting it through tReplace and through double regexps in a tMap, and I have also used tAggregateRow. There were no problems getting through all records.</description>
      <pubDate>Tue, 06 Nov 2012 18:09:36 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/tFileInputExcel-with-more-than-approx-2500-rows-gives/m-p/2284725#M58359</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2012-11-06T18:09:36Z</dc:date>
    </item>
    <item>
      <title>Re: tFileInputExcel with more than approx. 2500 rows gives stackoverflow</title>
      <link>https://community.qlik.com/t5/Talend-Studio/tFileInputExcel-with-more-than-approx-2500-rows-gives/m-p/2284726#M58360</link>
      <description>Hi, 
&lt;BR /&gt;
&lt;BLOCKQUOTE&gt;
 &lt;TABLE border="1"&gt;
  &lt;TBODY&gt;
   &lt;TR&gt;
    &lt;TD&gt;Regarding patterns it is specific to this component. I have since used the same data from CSV, where I converts it through tReplace, through double regexp's in a tMap, and have also used tAggregateRow&lt;/TD&gt;
   &lt;/TR&gt;
  &lt;/TBODY&gt;
 &lt;/TABLE&gt;
&lt;/BLOCKQUOTE&gt;
&lt;BR /&gt;You must have designed a job; would you mind uploading a screenshot for us (especially of the tMap)? From your description, the job flow is tFileInputDelimited--&amp;gt;tReplace--&amp;gt;tMap--&amp;gt;tAggregateRow--&amp;gt;tLogRow, right? Everything works with the .csv file but not with Excel? We need more info from you, thanks a lot!
&lt;BR /&gt;Best regards
&lt;BR /&gt;Sabrina</description>
      <pubDate>Wed, 07 Nov 2012 02:23:59 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/tFileInputExcel-with-more-than-approx-2500-rows-gives/m-p/2284726#M58360</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2012-11-07T02:23:59Z</dc:date>
    </item>
    <item>
      <title>Re: tFileInputExcel with more than approx. 2500 rows gives stackoverflow</title>
      <link>https://community.qlik.com/t5/Talend-Studio/tFileInputExcel-with-more-than-approx-2500-rows-gives/m-p/2284727#M58361</link>
      <description>I am seeing a similar issue. I am loading a date dimension from an Excel spreadsheet and tFileInputExcel fails after about 2600 records.</description>
      <pubDate>Wed, 12 Dec 2012 18:50:25 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/tFileInputExcel-with-more-than-approx-2500-rows-gives/m-p/2284727#M58361</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2012-12-12T18:50:25Z</dc:date>
    </item>
    <item>
      <title>Re: tFileInputExcel with more than approx. 2500 rows gives stackoverflow</title>
      <link>https://community.qlik.com/t5/Talend-Studio/tFileInputExcel-with-more-than-approx-2500-rows-gives/m-p/2284728#M58362</link>
      <description>Thank you jmagana 
&lt;BR /&gt;To xdshi. There is nothing more to it than I stated. No advanced designs needed. Just try it.</description>
      <pubDate>Wed, 12 Dec 2012 19:26:31 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/tFileInputExcel-with-more-than-approx-2500-rows-gives/m-p/2284728#M58362</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2012-12-12T19:26:31Z</dc:date>
    </item>
    <item>
      <title>Re: tFileInputExcel with more than approx. 2500 rows gives stackoverflow</title>
      <link>https://community.qlik.com/t5/Talend-Studio/tFileInputExcel-with-more-than-approx-2500-rows-gives/m-p/2284729#M58363</link>
      <description>Hi Jojs
&lt;BR /&gt;As a test, I read 10000 rows from an Excel file on v5.2.1 and it works. Which version are you using? Am I missing something needed to reproduce the problem? What do you mean by "Pattern (regex) which gives a stackoverflow"?
&lt;BR /&gt;Shong</description>
      <pubDate>Mon, 18 Feb 2013 03:25:36 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/tFileInputExcel-with-more-than-approx-2500-rows-gives/m-p/2284729#M58363</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2013-02-18T03:25:36Z</dc:date>
    </item>
    <item>
      <title>Re: tFileInputExcel with more than approx. 2500 rows gives stackoverflow</title>
      <link>https://community.qlik.com/t5/Talend-Studio/tFileInputExcel-with-more-than-approx-2500-rows-gives/m-p/2284730#M58364</link>
      <description>Hello 
&lt;BR /&gt;It is Excel 2007 (xlsx). I attach a job which fails, including the test data "test material.xlsx". (Oops, I can't attach zip files? Well, the explanation below should be enough.) 
&lt;BR /&gt;This test was run on Mac OS X 10.8.2 with TOS 5.2.1.r95162. 
&lt;BR /&gt;I did one additional test, where I copied only formats, numbers, dates and text into a new sheet, and deleted the original sheet. Now I can read all records, and it is much faster. This sheet is named "test material2.xlsx". 
&lt;BR /&gt;The sheet has the following columns: 
&lt;BR /&gt;id, Type, Color, Type title, Date, Revision, Title, Collection 
&lt;BR /&gt;I just realized that the first column with ids is built like this: 
&lt;BR /&gt;1 
&lt;BR /&gt;=+A2+1 
&lt;BR /&gt;=+A3+1 
&lt;BR /&gt;... 
&lt;BR /&gt;When I replace that with pure numbers, it can be read with no problems. 
&lt;BR /&gt;Regarding the observation about regular expressions, I would like to quote from the documentation: 
&lt;BR /&gt;tFileInputExcel opens a file and reads it row by row to split data up into fields using regular expressions. 
&lt;BR /&gt;I then tried changing the advanced setting "Generation mode" from "Memory-consuming" to "Less memory consumed". That also does the trick, and "Less memory consumed" actually reads all sheets faster than "Memory-consuming". 
&lt;BR /&gt;So I guess that formulas in the sheet and the "Memory-consuming" mode do not work well together. 
&lt;BR /&gt;And based on my tests, "Less memory consumed" seems to be faster anyway. 
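&lt;BR /&gt;A possible explanation, not confirmed anywhere in this thread: if the reader in "Memory-consuming" mode evaluates each formula by first evaluating the cell it refers to, a column of =+A2+1, =+A3+1, ... needs one nested call per row, and the stack runs out after a few thousand rows. The Python sketch below is only a toy model of that idea, not Talend's or any Excel library's code; build_chain, eval_recursive and eval_iterative are made-up names, and the recursion limit of 3000 is an arbitrary stand-in for a limited stack.

```python
import sys


def build_chain(n):
    """Return a column where cell 0 holds 1 and every later cell means 'cell above + 1'."""
    # None marks a formula cell; it mirrors =+A2+1, =+A3+1, ... in the sheet.
    return [1] + [None] * (n - 1)


def eval_recursive(cells, i):
    """Evaluate cell i by recursing into the cell above: one stack frame per row."""
    if cells[i] is not None:
        return cells[i]
    return eval_recursive(cells, i - 1) + 1


def eval_iterative(cells):
    """Evaluate top to bottom, reusing the previous result: constant stack depth."""
    values = []
    for cell in cells:
        values.append(cell if cell is not None else values[-1] + 1)
    return values


cells = build_chain(7000)

# The top-to-bottom pass handles all 7000 rows.
iterative_last = eval_iterative(cells)[-1]

# Hypothetical stack budget: 3000 frames is an assumption chosen for illustration.
old_limit = sys.getrecursionlimit()
sys.setrecursionlimit(3000)
try:
    eval_recursive(cells, len(cells) - 1)
    recursive_failed = False
except RecursionError:
    recursive_failed = True
finally:
    sys.setrecursionlimit(old_limit)
```

&lt;BR /&gt;In this toy model the recursive evaluator fails on the 7000-row chain while the top-to-bottom pass finishes, which would be consistent with the sheet being readable once the formulas are replaced by pure numbers.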
&lt;BR /&gt;Best regards - Jojs</description>
      <pubDate>Sat, 02 Mar 2013 13:55:48 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/tFileInputExcel-with-more-than-approx-2500-rows-gives/m-p/2284730#M58364</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2013-03-02T13:55:48Z</dc:date>
    </item>
    <item>
      <title>Re: tFileInputExcel with more than approx. 2500 rows gives stackoverflow</title>
      <link>https://community.qlik.com/t5/Talend-Studio/tFileInputExcel-with-more-than-approx-2500-rows-gives/m-p/2284731#M58365</link>
      <description>Hi Jojs, 
&lt;BR /&gt;I suggest that you open a new topic for your issue so that more people on the forum will see it. In addition, could you upload screenshots of your job to the forum to help us address your issue? 
&lt;BR /&gt;Best regards 
&lt;BR /&gt;Sabrina</description>
      <pubDate>Mon, 04 Mar 2013 02:33:40 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/tFileInputExcel-with-more-than-approx-2500-rows-gives/m-p/2284731#M58365</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2013-03-04T02:33:40Z</dc:date>
    </item>
    <item>
      <title>Re: tFileInputExcel with more than approx. 2500 rows gives stackoverflow</title>
      <link>https://community.qlik.com/t5/Talend-Studio/tFileInputExcel-with-more-than-approx-2500-rows-gives/m-p/2284732#M58366</link>
      <description>It is as simple as the attached. Does it tell you something new? 
&lt;BR /&gt;I am not sure what you mean by starting a new topic. About what, and where, and for whom? 
&lt;BR /&gt;To me it sounds like a bug in the "Memory-consuming" generation mode. But I would question the reason for even having that mode, since it is slower even for simple files. I suggest that Talend either removes that generation mode, fixes the bug, or documents it with the component. 
&lt;BR /&gt;(On another note, there is a similar option when parsing XML files, which is also questionable if it is about performance.) 
&lt;BR /&gt; 
&lt;BR /&gt;Best regards - Jojs 
&lt;BR /&gt;http://www.talendforge.org/forum/img/members/60190/mini_103940_Sk</description>
      <pubDate>Tue, 05 Mar 2013 19:19:47 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/tFileInputExcel-with-more-than-approx-2500-rows-gives/m-p/2284732#M58366</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2013-03-05T19:19:47Z</dc:date>
    </item>
    <item>
      <title>Re: tFileInputExcel with more than approx. 2500 rows gives stackoverflow</title>
      <link>https://community.qlik.com/t5/Talend-Studio/tFileInputExcel-with-more-than-approx-2500-rows-gives/m-p/2284733#M58367</link>
      <description>Hi, 
&lt;BR /&gt;Thank you for the comprehensive testing and summary. 
&lt;BR /&gt;Generation mode: available when Read excel2007 file format (xlsx) is selected in the Basic settings view. Select the mode used to read the Excel 2007 file. 
&lt;BR /&gt;Less memory consumed for large excel (Event mode): used for large files. This is a memory-saving mode that reads the Excel 2007 file as a flow. 
&lt;BR /&gt;Memory-consuming (User mode): used for small files. It needs much memory. That is the reason for what you observed: 
&lt;BLOCKQUOTE&gt; 
 &lt;TABLE border="1"&gt; 
  &lt;TBODY&gt; 
   &lt;TR&gt; 
    &lt;TD&gt;Now I tried to change the advanced setting for "Generation mode" from "Memory-consuming", to "Less memory consumed". That will also do the trick, and actually "Less memory consumed" reads all sheets faster than when using "Memory-consuming"&lt;BR /&gt;So I guess that formulas in the sheet and "Memory-consuming" do not work so well together.&lt;BR /&gt;And based on tests, it seems that "Less memory consumed" is faster anyway.&lt;/TD&gt; 
   &lt;/TR&gt; 
  &lt;/TBODY&gt; 
 &lt;/TABLE&gt; 
&lt;/BLOCKQUOTE&gt; 
&lt;BR /&gt;For more details, see the component reference 
&lt;A href="https://help.talend.com/search/all?query=tFileInputExcel&amp;amp;content-lang=en" target="_blank" rel="nofollow noopener noreferrer"&gt;tFileInputExcel&lt;/A&gt; 
&lt;BR /&gt;Best regards 
&lt;BR /&gt;Sabrina</description>
      <pubDate>Wed, 06 Mar 2013 09:41:50 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/tFileInputExcel-with-more-than-approx-2500-rows-gives/m-p/2284733#M58367</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2013-03-06T09:41:50Z</dc:date>
    </item>
    <item>
      <title>Re: tFileInputExcel with more than approx. 2500 rows gives stackoverflow</title>
      <link>https://community.qlik.com/t5/Talend-Studio/tFileInputExcel-with-more-than-approx-2500-rows-gives/m-p/2284734#M58368</link>
      <description>&lt;BLOCKQUOTE&gt; 
 &lt;TABLE border="1"&gt; 
  &lt;TBODY&gt; 
   &lt;TR&gt; 
    &lt;TD&gt;Memory-consuming (User mode): used for small file. It needs much memory. That is the reason&lt;/TD&gt; 
   &lt;/TR&gt; 
  &lt;/TBODY&gt; 
 &lt;/TABLE&gt; 
&lt;/BLOCKQUOTE&gt; 
&lt;BR /&gt;Yes, I read that. But that documentation does not state any benefit of using that mode. I don't think that "It needs much memory" qualifies as a benefit. And what is a "small" file anyway? 
&lt;BR /&gt;Based on my small test, I see no benefit in that "small but memory-consuming" mode at all. In my test, it is not any faster (which would be a benefit). 
&lt;BR /&gt;My suggestion to remove it was meant as an easy fix that reduces clutter at the same time. But I will leave that up to the Talend team. 
&lt;BR /&gt;(I apologise for mixing in a comment about XML. That option DOES have some special purpose, although I have yet to find examples taking advantage of it.)</description>
      <pubDate>Wed, 06 Mar 2013 11:43:10 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/tFileInputExcel-with-more-than-approx-2500-rows-gives/m-p/2284734#M58368</guid>
      <dc:creator>Jsi</dc:creator>
      <dc:date>2013-03-06T11:43:10Z</dc:date>
    </item>
    <item>
      <title>Re: tFileInputExcel with more than approx. 2500 rows gives stackoverflow</title>
      <link>https://community.qlik.com/t5/Talend-Studio/tFileInputExcel-with-more-than-approx-2500-rows-gives/m-p/2284735#M58369</link>
      <description>Hi Jsi, 
&lt;BR /&gt;Could you open an issue for the Doc team, in 
&lt;A href="https://jira.talendforge.org/browse/DOCT" rel="nofollow noopener noreferrer"&gt;https://jira.talendforge.org/browse/DOCT&lt;/A&gt;, please? 
&lt;BR /&gt;It seems we need to be more accurate in the documentation, and we also need to clarify with the dev team in which cases it is worthwhile to use the more "memory-consuming" mode. It seems we missed that point. 
&lt;BR /&gt;Cheers, 
&lt;BR /&gt;Elisa</description>
      <pubDate>Wed, 06 Mar 2013 12:34:42 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/tFileInputExcel-with-more-than-approx-2500-rows-gives/m-p/2284735#M58369</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2013-03-06T12:34:42Z</dc:date>
    </item>
  </channel>
</rss>

