<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: tStandardizeRow Usage? in Talend Studio</title>
    <link>https://community.qlik.com/t5/Talend-Studio/tStandardizeRow-Usage/m-p/2267265#M46224</link>
    <description>Finally,&lt;BR /&gt;I have used a tHDFSInput followed by a tMap.&lt;BR /&gt;The tMap does a substring on the input rows.&lt;BR /&gt;Do you think it is a good solution?&lt;BR /&gt;I am working with a very big file (90 GB).&lt;BR /&gt;&lt;BR /&gt;Best regards</description>
    <pubDate>Thu, 20 Feb 2014 10:34:59 GMT</pubDate>
    <dc:creator>_AnonymousUser</dc:creator>
    <dc:date>2014-02-20T10:34:59Z</dc:date>
    <item>
      <title>tStandardizeRow Usage?</title>
      <link>https://community.qlik.com/t5/Talend-Studio/tStandardizeRow-Usage/m-p/2267262#M46221</link>
      <description>Hello, I have a non-delimited file and I would like to parse it.&lt;BR /&gt;Would it be possible to split my file (according to field lengths) by using a regex?&lt;BR /&gt;For example, I want to say:&lt;BR /&gt;the 1st field is characters 1 to 7, the 2nd is characters 8 to 12 ...&lt;BR /&gt;Is it possible? Where can I configure it?&lt;BR /&gt;Thank you in advance</description>
      <pubDate>Wed, 19 Feb 2014 15:49:56 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/tStandardizeRow-Usage/m-p/2267262#M46221</guid>
      <dc:creator>_AnonymousUser</dc:creator>
      <dc:date>2014-02-19T15:49:56Z</dc:date>
    </item>
    <item>
      <title>Re: tStandardizeRow Usage?</title>
      <link>https://community.qlik.com/t5/Talend-Studio/tStandardizeRow-Usage/m-p/2267263#M46222</link>
      <description>Hi, 
&lt;BR /&gt;Regarding your previous post 
&lt;A href="https://community.qlik.com/s/feed/0D53p00007vCmnXCAS" target="_blank" rel="nofollow noopener noreferrer"&gt;https://community.talend.com/t5/Design-and-Development/Big-Data-Positional-File/td-p/85416&lt;/A&gt;, it seems you have to use a MapReduce Job. 
&lt;BR /&gt;If so, note that 
&lt;A href="https://help.talend.com/search/all?query=tFileInputRegex&amp;amp;content-lang=en" target="_blank" rel="nofollow noopener noreferrer"&gt;TalendHelpCenter:tFileInputRegex&lt;/A&gt; is not supported in MapReduce Jobs yet. 
&lt;BR /&gt;Here is a solution for your use case: first put your file into Hadoop, then use tHDFSInput ---&amp;gt; tMap (tHDFSInput ---&amp;gt; tJavaMR). 
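&lt;BR /&gt;As a rough illustration only, the per-line slicing that a tMap expression or tJavaMR code would perform can be sketched in plain Java. The field boundaries (characters 1 to 7 and 8 to 12) come from the question; the class name, the method name and the sample line are hypothetical, and this is not Talend-generated code.

```java
class FixedWidthSplit {
    // Hypothetical layout from the question: field 1 = chars 1-7, field 2 = chars 8-12.
    // Each pair is {start, end} as 0-based, end-exclusive offsets for String.substring.
    static final int[][] FIELDS = { {0, 7}, {7, 12} };

    // Cuts one raw line into its positional fields.
    static String[] split(String line) {
        String[] out = new String[FIELDS.length];
        int i = 0;
        for (int[] f : FIELDS) {
            // Clamp to the line length so short lines yield shorter or empty
            // fields instead of throwing StringIndexOutOfBoundsException.
            int start = Math.min(f[0], line.length());
            int end = Math.min(f[1], line.length());
            out[i++] = line.substring(start, end);
        }
        return out;
    }

    public static void main(String[] args) {
        // Made-up sample line, 12 characters long.
        String[] parts = split("ABCDEFG12345");
        System.out.println(parts[0] + "|" + parts[1]);
    }
}
```

&lt;BR /&gt;In a tMap, the same idea would be one Java expression per output column, for example row1.line.substring(0, 7), where row1 and line are assumed names for the input flow and its single raw-string column.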
&lt;BR /&gt;Best regards 
&lt;BR /&gt;Sabrina</description>
      <pubDate>Thu, 20 Feb 2014 02:33:01 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/tStandardizeRow-Usage/m-p/2267263#M46222</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2014-02-20T02:33:01Z</dc:date>
    </item>
    <item>
      <title>Re: tStandardizeRow Usage?</title>
      <link>https://community.qlik.com/t5/Talend-Studio/tStandardizeRow-Usage/m-p/2267264#M46223</link>
      <description>Hi Sabrina,&lt;BR /&gt;Thank you for your attention.&lt;BR /&gt;So, I will use tHDFSInput (with a single-column schema, raw string) -&amp;gt; a tJavaMR (with my real CSV columns) -&amp;gt; tLogRow&lt;BR /&gt;&lt;BR /&gt;Do you see anything wrong with this?</description>
      <pubDate>Thu, 20 Feb 2014 09:16:25 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/tStandardizeRow-Usage/m-p/2267264#M46223</guid>
      <dc:creator>_AnonymousUser</dc:creator>
      <dc:date>2014-02-20T09:16:25Z</dc:date>
    </item>
    <item>
      <title>Re: tStandardizeRow Usage?</title>
      <link>https://community.qlik.com/t5/Talend-Studio/tStandardizeRow-Usage/m-p/2267265#M46224</link>
      <description>Finally,&lt;BR /&gt;I have used a tHDFSInput followed by a tMap.&lt;BR /&gt;The tMap does a substring on the input rows.&lt;BR /&gt;Do you think it is a good solution?&lt;BR /&gt;I am working with a very big file (90 GB).&lt;BR /&gt;&lt;BR /&gt;Best regards</description>
      <pubDate>Thu, 20 Feb 2014 10:34:59 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/tStandardizeRow-Usage/m-p/2267265#M46224</guid>
      <dc:creator>_AnonymousUser</dc:creator>
      <dc:date>2014-02-20T10:34:59Z</dc:date>
    </item>
    <item>
      <title>Re: tStandardizeRow Usage?</title>
      <link>https://community.qlik.com/t5/Talend-Studio/tStandardizeRow-Usage/m-p/2267266#M46225</link>
      <description>Hi, 
&lt;BR /&gt;In case your Job runs into a memory issue caused by the big file, could you please take a look at the online KB article 
&lt;BR /&gt; 
&lt;A href="https://community.qlik.com/s/article/ka03p0000006EZuAAM" target="_blank"&gt;TalendHelpCenter:ExceptionoutOfMemory&lt;/A&gt;. 
&lt;BR /&gt;Best regards 
&lt;BR /&gt;Sabrina</description>
      <pubDate>Fri, 21 Feb 2014 04:34:05 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/tStandardizeRow-Usage/m-p/2267266#M46225</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2014-02-21T04:34:05Z</dc:date>
    </item>
    <item>
      <title>Re: tStandardizeRow Usage?</title>
      <link>https://community.qlik.com/t5/Talend-Studio/tStandardizeRow-Usage/m-p/2267267#M46226</link>
      <description>Thank you Sabrina. 
&lt;BR /&gt;Can you confirm one last thing for me? 
&lt;BR /&gt;MapReduce Jobs run on my cluster, don't they? 
&lt;BR /&gt; 
&lt;BR /&gt;So the memory exception would be caused by the tLogRow? If I insert the data directly into a database, it shouldn't happen, should it? 
&lt;BR /&gt; 
&lt;BR /&gt;Thank you a lot for your help Sabrina.</description>
      <pubDate>Fri, 21 Feb 2014 09:21:42 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/tStandardizeRow-Usage/m-p/2267267#M46226</guid>
      <dc:creator>_AnonymousUser</dc:creator>
      <dc:date>2014-02-21T09:21:42Z</dc:date>
    </item>
    <item>
      <title>Re: tStandardizeRow Usage?</title>
      <link>https://community.qlik.com/t5/Talend-Studio/tStandardizeRow-Usage/m-p/2267268#M46227</link>
      <description>Hi, 
&lt;BR /&gt;The tMap component is a cache component that consumes too much memory. You'd better store temp data on disk. 
&lt;BR /&gt; 
&lt;BLOCKQUOTE&gt; 
 &lt;TABLE border="1"&gt; 
  &lt;TBODY&gt; 
   &lt;TR&gt; 
    &lt;TD&gt;If i directly insert the data in a database. It shouldn't happen no?&lt;/TD&gt; 
   &lt;/TR&gt; 
  &lt;/TBODY&gt; 
 &lt;/TABLE&gt; 
&lt;/BLOCKQUOTE&gt; 
&lt;BR /&gt;It depends on your input data and your design. 
&lt;BR /&gt;There are several possible reasons for an OutOfMemory Java exception to occur. The most common ones include: 
&lt;BR /&gt;1. Running a Job that contains a number of buffer components, such as tSortRow, tFilterRow, tMap, tAggregateRow or tHashOutput. 
&lt;BR /&gt;2. Running a Job that processes a very large amount of data. 
&lt;BR /&gt; 
&lt;BR /&gt;Best regards 
&lt;BR /&gt;Sabrina</description>
      <pubDate>Fri, 21 Feb 2014 09:39:39 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/tStandardizeRow-Usage/m-p/2267268#M46227</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2014-02-21T09:39:39Z</dc:date>
    </item>
  </channel>
</rss>

