<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic [resolved] tSortRow and Large Files in Talend Studio</title>
    <link>https://community.qlik.com/t5/Talend-Studio/resolved-tSortRow-and-Large-Files/m-p/2199656#M2474</link>
    <description>I am just starting with TOS 5.6.0 and I am trying to sort a large CSV file (2.5 GB, 11M rows, 45 columns). I am setting the JVM to 2 GB and I have tried various buffer sizes for the external sort on the Advanced settings tab. The stack trace shows that the out-of-memory error occurs in various places, but the result is always similar to: 
&lt;BR /&gt;Exception in thread "Thread-0" java.lang.OutOfMemoryError: Java heap space 
&lt;BR /&gt;at java.util.LinkedList.listIterator(LinkedList.java:667) 
&lt;BR /&gt;at java.util.AbstractList.listIterator(AbstractList.java:284) 
&lt;BR /&gt;at java.util.AbstractSequentialList.iterator(AbstractSequentialList.java:222) 
&lt;BR /&gt;at routines.system.RunStat.sendMessages(RunStat.java:261) 
&lt;BR /&gt;at routines.system.RunStat.run(RunStat.java:225) 
&lt;BR /&gt;at java.lang.Thread.run(Thread.java:662) 
&lt;BR /&gt;Exception in thread "main" java.lang.OutOfMemoryError: Java heap space 
&lt;BR /&gt;at java.lang.StringBuilder.toString(StringBuilder.java:430) 
&lt;BR /&gt;at com.talend.csv.CSVReader.endColumn(CSVReader.java:131) 
&lt;BR /&gt;at com.talend.csv.CSVReader.readNext(CSVReader.java:301) 
&lt;BR /&gt;at johnmdm.sqlinout_0_1.SQLInOut.tFileInputDelimited_1Process(SQLInOut.java:3380) 
&lt;BR /&gt;at johnmdm.sqlinout_0_1.SQLInOut.runJobInTOS(SQLInOut.java:5199) 
&lt;BR /&gt;at johnmdm.sqlinout_0_1.SQLInOut.main(SQLInOut.java:5056) 
&lt;BR /&gt; 
--john</description>
    <pubDate>Sat, 16 Nov 2024 11:25:09 GMT</pubDate>
    <dc:creator>Anonymous</dc:creator>
    <dc:date>2024-11-16T11:25:09Z</dc:date>
    <item>
      <title>[resolved] tSortRow and Large Files</title>
      <link>https://community.qlik.com/t5/Talend-Studio/resolved-tSortRow-and-Large-Files/m-p/2199656#M2474</link>
      <description>I am just starting with TOS 5.6.0 and I am trying to sort a large CSV file (2.5 GB, 11M rows, 45 columns). I am setting the JVM to 2 GB and I have tried various buffer sizes for the external sort on the Advanced settings tab. The stack trace shows that the out-of-memory error occurs in various places, but the result is always similar to: 
&lt;BR /&gt;Exception in thread "Thread-0" java.lang.OutOfMemoryError: Java heap space 
&lt;BR /&gt;at java.util.LinkedList.listIterator(LinkedList.java:667) 
&lt;BR /&gt;at java.util.AbstractList.listIterator(AbstractList.java:284) 
&lt;BR /&gt;at java.util.AbstractSequentialList.iterator(AbstractSequentialList.java:222) 
&lt;BR /&gt;at routines.system.RunStat.sendMessages(RunStat.java:261) 
&lt;BR /&gt;at routines.system.RunStat.run(RunStat.java:225) 
&lt;BR /&gt;at java.lang.Thread.run(Thread.java:662) 
&lt;BR /&gt;Exception in thread "main" java.lang.OutOfMemoryError: Java heap space 
&lt;BR /&gt;at java.lang.StringBuilder.toString(StringBuilder.java:430) 
&lt;BR /&gt;at com.talend.csv.CSVReader.endColumn(CSVReader.java:131) 
&lt;BR /&gt;at com.talend.csv.CSVReader.readNext(CSVReader.java:301) 
&lt;BR /&gt;at johnmdm.sqlinout_0_1.SQLInOut.tFileInputDelimited_1Process(SQLInOut.java:3380) 
&lt;BR /&gt;at johnmdm.sqlinout_0_1.SQLInOut.runJobInTOS(SQLInOut.java:5199) 
&lt;BR /&gt;at johnmdm.sqlinout_0_1.SQLInOut.main(SQLInOut.java:5056) 
&lt;BR /&gt; 
--john</description>
      <pubDate>Sat, 16 Nov 2024 11:25:09 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/resolved-tSortRow-and-Large-Files/m-p/2199656#M2474</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2024-11-16T11:25:09Z</dc:date>
    </item>
    <item>
      <title>Re: [resolved] tSortRow and Large Files</title>
      <link>https://community.qlik.com/t5/Talend-Studio/resolved-tSortRow-and-Large-Files/m-p/2199657#M2475</link>
      <description>Hi 
&lt;BR /&gt;Take a look at this KB 
&lt;A href="https://community.qlik.com/s/article/ka03p0000006EZuAAM" target="_blank"&gt;article&lt;/A&gt;. To resolve this error, store the data on disk instead of in memory: check the 'Sort on disk' box on the Advanced settings tab of the tSortRow component. 
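&lt;BR /&gt;For readers wondering what sorting on disk amounts to in general, here is a minimal Python sketch of an external merge sort. It is not Talend's implementation and says nothing about how tSortRow works internally; the function name external_sort_csv and the parameters key_index and chunk_rows are hypothetical, and the sketch assumes the file has no header row and that the key column can be compared as text.

```python
import csv
import heapq
import itertools
import os
import tempfile


def external_sort_csv(src_path, dst_path, key_index, chunk_rows=100_000):
    """Sort a CSV file by one column while holding only chunk_rows rows in memory.

    Hypothetical helper for illustration; keys are compared as strings.
    """
    def sort_key(row):
        return row[key_index]

    chunk_paths = []
    with open(src_path, newline="") as src:
        reader = csv.reader(src)
        while True:
            # Phase 1: read a bounded chunk, sort it in memory, spill it to disk.
            chunk = list(itertools.islice(reader, chunk_rows))
            if not chunk:
                break
            chunk.sort(key=sort_key)
            fd, path = tempfile.mkstemp(suffix=".csv")
            with os.fdopen(fd, "w", newline="") as out:
                csv.writer(out).writerows(chunk)
            chunk_paths.append(path)

    # Phase 2: stream all sorted chunks through a k-way merge.
    files = [open(path, newline="") for path in chunk_paths]
    try:
        readers = [csv.reader(f) for f in files]
        with open(dst_path, "w", newline="") as dst:
            csv.writer(dst).writerows(heapq.merge(*readers, key=sort_key))
    finally:
        # Close the temporary files before deleting them.
        for f in files:
            f.close()
        for path in chunk_paths:
            os.remove(path)
```

A call such as external_sort_csv("input.csv", "sorted.csv", key_index=0) would then write the sorted rows to sorted.csv. Peak memory depends on chunk_rows rather than on the size of the file, which is the point of spilling to disk; a numeric key would need sort_key to convert the value with int() first.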
&lt;BR /&gt;Best regards 
&lt;BR /&gt;Shong</description>
      <pubDate>Mon, 24 Nov 2014 06:55:07 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/resolved-tSortRow-and-Large-Files/m-p/2199657#M2475</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2014-11-24T06:55:07Z</dc:date>
    </item>
    <item>
      <title>Re: [resolved] tSortRow and Large Files</title>
      <link>https://community.qlik.com/t5/Talend-Studio/resolved-tSortRow-and-Large-Files/m-p/2199658#M2476</link>
      <description>Hi, I am trying to solve a performance issue when sorting a huge file (50 million records, 6 columns) on an integer column plus an alphabetic column. tSortRow takes around 30 minutes with sort on disk enabled.
&lt;BR /&gt;
I am using TOS 5.6.2 and evaluating this sort for my POC. Please advise on an optimized job design.</description>
      <pubDate>Wed, 27 Jan 2016 13:56:57 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/resolved-tSortRow-and-Large-Files/m-p/2199658#M2476</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2016-01-27T13:56:57Z</dc:date>
    </item>
  </channel>
</rss>

