<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Performance on a job in Talend Studio</title>
    <link>https://community.qlik.com/t5/Talend-Studio/Performance-on-a-job/m-p/2199204#M2210</link>
    <description>Hi,&lt;BR /&gt;Some questions:&lt;BR /&gt;- How many rows does your job process?&lt;BR /&gt;- Do you call routines in your tMap?&lt;BR /&gt;- Are your databases local or on the network?&lt;BR /&gt;And a tip from experience:&lt;BR /&gt;Before launching the job, check the "Statistics" box to see where the dataflow is slow.</description>
    <pubDate>Tue, 10 Jun 2008 08:22:59 GMT</pubDate>
    <dc:creator>Anonymous</dc:creator>
    <dc:date>2008-06-10T08:22:59Z</dc:date>
    <item>
      <title>Performance on a job</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Performance-on-a-job/m-p/2199203#M2209</link>
      <description>Hi, 
&lt;BR /&gt;For me, Talend is a very good solution, but I am now running performance tests and my results are disastrous. 
&lt;BR /&gt;I have attached a screenshot of my job, which describes my business rules. 
&lt;BR /&gt;My project is to replace a loader built with Access. 
&lt;BR /&gt;With my current loader, the execution time of this job is 4 minutes and 30 seconds. 
&lt;BR /&gt;With Talend, the execution time of this job is 1 hour and 28 minutes. 
&lt;BR /&gt;How can I improve the performance? 
&lt;BR /&gt;Thanks. 
&lt;BR /&gt; 
&lt;span class="lia-inline-image-display-wrapper" image-alt="0683p000009MCHG.jpg"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/141857iB35A78BD25EB4067/image-size/large?v=v2&amp;amp;px=999" role="button" title="0683p000009MCHG.jpg" alt="0683p000009MCHG.jpg" /&gt;&lt;/span&gt; 
</description>
      <pubDate>Sat, 16 Nov 2024 14:20:30 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Performance-on-a-job/m-p/2199203#M2209</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2024-11-16T14:20:30Z</dc:date>
    </item>
    <item>
      <title>Re: Performance on a job</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Performance-on-a-job/m-p/2199204#M2210</link>
      <description>Hi,&lt;BR /&gt;Some questions:&lt;BR /&gt;- How many rows does your job process?&lt;BR /&gt;- Do you call routines in your tMap?&lt;BR /&gt;- Are your databases local or on the network?&lt;BR /&gt;And a tip from experience:&lt;BR /&gt;Before launching the job, check the "Statistics" box to see where the dataflow is slow.</description>
      <pubDate>Tue, 10 Jun 2008 08:22:59 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Performance-on-a-job/m-p/2199204#M2210</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2008-06-10T08:22:59Z</dc:date>
    </item>
    <item>
      <title>Re: Performance on a job</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Performance-on-a-job/m-p/2199205#M2211</link>
      <description>There are 421 000 rows in my job. 
&lt;BR /&gt;Yes, I call some routines in my two tMap components, but they are not complex. 
&lt;BR /&gt;My Oracle database is on the network and the Access database is local. 
&lt;BR /&gt;It's exactly the same configuration as my other loader. 
&lt;BR /&gt;One more detail: in my job, the tAccessInput and the tAccessOutput use the same Access database.</description>
      <pubDate>Tue, 10 Jun 2008 08:36:38 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Performance-on-a-job/m-p/2199205#M2211</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2008-06-10T08:36:38Z</dc:date>
    </item>
    <item>
      <title>Re: Performance on a job</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Performance-on-a-job/m-p/2199206#M2212</link>
      <description>Before launching the job, check the "Statistics" box to see where the dataflow is slow.&lt;BR /&gt;With this, you'll be able to see where things are slow (is it on Oracle or Access?).&lt;BR /&gt;Once you have checked the statistics, try separating the Access input and output into two files.</description>
      <pubDate>Tue, 10 Jun 2008 08:42:45 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Performance-on-a-job/m-p/2199206#M2212</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2008-06-10T08:42:45Z</dc:date>
    </item>
    <item>
      <title>Re: Performance on a job</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Performance-on-a-job/m-p/2199207#M2213</link>
      <description>Showing the statistics is not very interesting because the rate is identical throughout the job: about 120 rows/second.</description>
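The figures quoted so far can be cross-checked with a little arithmetic. The sketch below is only an illustration: the constants are the numbers posted in this thread, and the helper names are made up for the example. It suggests that the full Talend run averaged roughly 80 rows/second, that the Access loader averaged about 1,560 rows/second, and that a steady 120 rows/second would finish the 421 000 rows in a little under an hour.

```python
# Figures quoted in the thread; the helper names are hypothetical.
ROWS = 421_000                        # rows processed by the job
TALEND_SECONDS = 1 * 3600 + 28 * 60   # 1 hour 28 minutes
ACCESS_SECONDS = 4 * 60 + 30          # 4 minutes 30 seconds
OBSERVED_RATE = 120                   # rows/second shown by the statistics


def rows_per_second(rows, seconds):
    """Average throughput over a whole run."""
    return rows / seconds


def minutes_at_rate(rows, rate):
    """Run time in minutes if the rate stayed constant."""
    return rows / rate / 60


if __name__ == "__main__":
    print("Talend average rows/s:", round(rows_per_second(ROWS, TALEND_SECONDS), 1))
    print("Access average rows/s:", round(rows_per_second(ROWS, ACCESS_SECONDS), 1))
    print("Minutes at 120 rows/s:", round(minutes_at_rate(ROWS, OBSERVED_RATE), 1))
```

The gap between the 120 rows/second shown in the statistics and the lower whole-run average hints that part of the run time is spent outside the steady row flow, although the thread does not say where.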
      <pubDate>Tue, 10 Jun 2008 08:51:45 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Performance-on-a-job/m-p/2199207#M2213</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2008-06-10T08:51:45Z</dc:date>
    </item>
    <item>
      <title>Re: Performance on a job</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Performance-on-a-job/m-p/2199208#M2214</link>
      <description>It is interesting: it shows that the dataflow is slow at the entry point, your Oracle database on the network.&lt;BR /&gt;Could you post a screenshot of your tOracleInput component's configuration?</description>
      <pubDate>Tue, 10 Jun 2008 08:55:32 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Performance-on-a-job/m-p/2199208#M2214</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2008-06-10T08:55:32Z</dc:date>
    </item>
    <item>
      <title>Re: Performance on a job</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Performance-on-a-job/m-p/2199209#M2215</link>
      <description>A clarification:
&lt;BR /&gt;Oracle is connected through ODBC. I tried JDBC and the performance is about the same. 
&lt;BR /&gt;Access is local: the file is on my computer.
&lt;BR /&gt;Oracle is remote, on my network.
&lt;BR /&gt;I added a screenshot of my settings to my first post. See the top of the thread.</description>
      <pubDate>Wed, 11 Jun 2008 10:42:10 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Performance-on-a-job/m-p/2199209#M2215</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2008-06-11T10:42:10Z</dc:date>
    </item>
    <item>
      <title>Re: Performance on a job</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Performance-on-a-job/m-p/2199210#M2216</link>
      <description>Talend Open Studio generates Java or Perl code.&lt;BR /&gt;Neither of these languages handles Access databases natively.&lt;BR /&gt;The Java Access DB components communicate through an ODBC bridge.&lt;BR /&gt;It will never be as fast as a native connection or a real JDBC connection.&lt;BR /&gt;Generally speaking, input is not the problem. Writing always takes longer...&lt;BR /&gt;That's why changing tOracleInput to ODBC doesn't change the results.&lt;BR /&gt;You can try tweaking the Advanced Settings / Autocommit value in your tAccessOutput: increase this value.&lt;BR /&gt;HTH,</description>
      <pubDate>Wed, 11 Jun 2008 11:51:18 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Performance-on-a-job/m-p/2199210#M2216</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2008-06-11T11:51:18Z</dc:date>
    </item>
    <item>
      <title>Re: Performance on a job</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Performance-on-a-job/m-p/2199211#M2217</link>
      <description>For me, it's much faster to read/write an SQL database than a flat file (I can reach 40000 rows/sec against 3000 for a flat file)
&lt;BR /&gt;All my jobs use Java</description>
      <pubDate>Wed, 11 Jun 2008 12:35:34 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Performance-on-a-job/m-p/2199211#M2217</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2008-06-11T12:35:34Z</dc:date>
    </item>
    <item>
      <title>Re: Performance on a job</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Performance-on-a-job/m-p/2199212#M2218</link>
      <description>I only said that reading is faster than writing.
&lt;BR /&gt;About writing to file, on my laptop with a quite slow hard disk, I can easily go up to 75000 rows per second.
&lt;BR /&gt;My input file has 1 000 000 rows and 11 columns with different data types (Integer, String, and Date).
&lt;BR /&gt;I write to a simple tFileOutputDelimited without any CSV options...
&lt;BR /&gt;My only "special" configuration is to temporarily disable my antivirus.
&lt;BR /&gt;Can you give me more details about your own tests ?</description>
      <pubDate>Wed, 11 Jun 2008 21:57:44 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Performance-on-a-job/m-p/2199212#M2218</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2008-06-11T21:57:44Z</dc:date>
    </item>
    <item>
      <title>Re: Performance on a job</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Performance-on-a-job/m-p/2199213#M2219</link>
      <description>Maybe I'm missing something - sorry to intrude.  &lt;BR /&gt;However, it was mentioned earlier that one of the databases is Oracle, yet I see no tOracle components in the job. How come?</description>
      <pubDate>Wed, 11 Jun 2008 22:28:22 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Performance-on-a-job/m-p/2199213#M2219</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2008-06-11T22:28:22Z</dc:date>
    </item>
    <item>
      <title>Re: Performance on a job</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Performance-on-a-job/m-p/2199214#M2220</link>
      <description>You don't see a tOracleInput because I use the ODBC component to connect to the Oracle data warehouse. The input component is labeled DWH in my screenshot.
&lt;BR /&gt;For Mhirt: on what type of database can you load at 75000 rows/s? Oracle? Do you display the statistics when you measure the performance, or not?
&lt;BR /&gt;Thanks</description>
      <pubDate>Wed, 11 Jun 2008 23:21:50 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Performance-on-a-job/m-p/2199214#M2220</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2008-06-11T23:21:50Z</dc:date>
    </item>
    <item>
      <title>Re: Performance on a job</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Performance-on-a-job/m-p/2199215#M2221</link>
      <description>Sorry suzchr, I get 75000 rows/s with file to file. My message was for Maverick (he is limited to only 3000 rows per second and I don't understand why).
&lt;BR /&gt;For databases, the best performance is obtained with bulk components (not available for Access).
&lt;BR /&gt;Otherwise, it mainly comes down to Autocommit tweaking.
&lt;BR /&gt;In Java you can show statistics; it doesn't affect performance much.
&lt;BR /&gt;In Perl, it has more impact.
&lt;BR /&gt;HTH,</description>
      <pubDate>Wed, 11 Jun 2008 23:37:20 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Performance-on-a-job/m-p/2199215#M2221</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2008-06-11T23:37:20Z</dc:date>
    </item>
    <item>
      <title>Re: Performance on a job</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Performance-on-a-job/m-p/2199216#M2222</link>
      <description>I checked again. 
&lt;BR /&gt;I was exaggerating with 2000 rows/sec :s 
&lt;BR /&gt;It's 7000 rows/sec when reading from a delimited flat file with the following specs: 
&lt;BR /&gt;- Number of rows: 7 000 000 
&lt;BR /&gt;- Number of columns: 12 
&lt;BR /&gt;- My job writes to an Excel file. If I write to a delimited file, the rate rises to 15000 rows/sec 
&lt;BR /&gt;- HDD speed: 7200 RPM 
&lt;BR /&gt;- I have an antivirus, but I can't disable it (SBS behind 
&lt;span class="lia-inline-image-display-wrapper" image-alt="0683p000009MACn.png"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/154443iC5B8CACEF3D12C6A/image-size/large?v=v2&amp;amp;px=999" role="button" title="0683p000009MACn.png" alt="0683p000009MACn.png" /&gt;&lt;/span&gt;) 
&lt;BR /&gt;But never mind, I don't have any problem with this 
&lt;span class="lia-inline-image-display-wrapper" image-alt="0683p000009MACn.png"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/154443iC5B8CACEF3D12C6A/image-size/large?v=v2&amp;amp;px=999" role="button" title="0683p000009MACn.png" alt="0683p000009MACn.png" /&gt;&lt;/span&gt;</description>
      <pubDate>Thu, 12 Jun 2008 09:01:57 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Performance-on-a-job/m-p/2199216#M2222</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2008-06-12T09:01:57Z</dc:date>
    </item>
    <item>
      <title>Re: Performance on a job</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Performance-on-a-job/m-p/2199217#M2223</link>
      <description>mhirt, I have a question for you! I see that your status is Talend Team. Does that mean you work for Talend?</description>
      <pubDate>Thu, 12 Jun 2008 09:48:10 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Performance-on-a-job/m-p/2199217#M2223</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2008-06-12T09:48:10Z</dc:date>
    </item>
    <item>
      <title>Re: Performance on a job</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Performance-on-a-job/m-p/2199218#M2224</link>
      <description>I have another question about my job. To improve performance, I need to modify the commit setting on tAccessOutput. However, I don't know whether a large commit interval (every 20000 rows, for example) or a small one (every 10 rows, for example) is better. 
&lt;BR /&gt;I use a computer with 1 GB of RAM and my process writes 422188 rows to my Access database.</description>
      <pubDate>Thu, 12 Jun 2008 09:52:39 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Performance-on-a-job/m-p/2199218#M2224</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2008-06-12T09:52:39Z</dc:date>
    </item>
    <item>
      <title>Re: Performance on a job</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Performance-on-a-job/m-p/2199219#M2225</link>
      <description>Does anybody know how the commit is done if I set the "commit every" value to 0?</description>
      <pubDate>Thu, 12 Jun 2008 14:28:11 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Performance-on-a-job/m-p/2199219#M2225</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2008-06-12T14:28:11Z</dc:date>
    </item>
    <item>
      <title>Re: Performance on a job</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Performance-on-a-job/m-p/2199220#M2226</link>
      <description>suzchr, &lt;BR /&gt;&lt;BLOCKQUOTE&gt;&lt;TABLE border="1"&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TD&gt;I have a question for you! I see that your status is Talend Team. Does that mean you work for Talend?&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;/BLOCKQUOTE&gt;&lt;BR /&gt;Yes, I work for Talend! &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;BR /&gt;&lt;BLOCKQUOTE&gt;&lt;TABLE border="1"&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TD&gt;However, I don't know whether a large commit interval (every 20000 rows, for example) or a small one (every 10 rows, for example) is better.&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;/BLOCKQUOTE&gt;&lt;BR /&gt;In general, a large "commit every" value is better, but it's not as simple as that.&lt;BR /&gt;You may get better performance with a commit every of 40000 than with a commit every of 50000.&lt;BR /&gt;You have to run tests to find the best value.&lt;BR /&gt;&lt;BLOCKQUOTE&gt;&lt;TABLE border="1"&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TD&gt;Does anybody know how the commit is done if I set the "commit every" value to 0?&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;/BLOCKQUOTE&gt;&lt;BR /&gt;With 0 or empty, there won't be any commit at all.&lt;BR /&gt;HTH,</description>
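The advice above, to test several "commit every" values, could be tried outside Talend with a small harness along these lines. This is only a rough sketch under stated assumptions: it uses Python's built-in sqlite3 with an in-memory database as a stand-in for Access, the table name target and both function names are invented for the example, and it is not the code that tAccessOutput generates. Timings on a real Access file reached through ODBC will likely look quite different.

```python
import sqlite3
import time


def load_rows(conn, rows, commit_every):
    """Insert rows one by one and commit after every `commit_every` rows.

    Returns the number of commits issued. In this sketch a value of 0
    simply means "commit once at the end"; that is a choice made for the
    example and not a description of Talend's behaviour.
    """
    cur = conn.cursor()
    commits = 0
    pending = 0
    for row in rows:
        cur.execute("INSERT INTO target (id, label) VALUES (?, ?)", row)
        pending += 1
        if commit_every > 0 and pending == commit_every:
            conn.commit()
            commits += 1
            pending = 0
    if pending > 0:
        conn.commit()  # flush the last partial batch
        commits += 1
    return commits


def time_commit_values(candidates, n_rows):
    """Run the same load once per candidate value and record the outcome."""
    results = {}
    for commit_every in candidates:
        conn = sqlite3.connect(":memory:")  # stand-in database (assumption)
        conn.execute("CREATE TABLE target (id INTEGER, label TEXT)")
        rows = ((i, "row %d" % i) for i in range(n_rows))
        start = time.perf_counter()
        commits = load_rows(conn, rows, commit_every)
        elapsed = time.perf_counter() - start
        count = conn.execute("SELECT COUNT(*) FROM target").fetchone()[0]
        conn.close()
        results[commit_every] = (count, commits, elapsed)
    return results


if __name__ == "__main__":
    outcome = time_commit_values([1, 10, 1000, 10000], 20000)
    for value, (count, commits, elapsed) in outcome.items():
        print(value, "rows:", count, "commits:", commits, "seconds:", round(elapsed, 3))
```

Each candidate gets a fresh database so the runs do not influence one another, and the row count is read back to confirm that every variant wrote the same data before the timings are compared.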
      <pubDate>Thu, 12 Jun 2008 23:41:03 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Performance-on-a-job/m-p/2199220#M2226</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2008-06-12T23:41:03Z</dc:date>
    </item>
    <item>
      <title>Re: Performance on a job</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Performance-on-a-job/m-p/2199221#M2227</link>
      <description>Thank you for all your answers! 
&lt;BR /&gt;I am running benchmarks on my job and will post my results afterwards. 
&lt;BR /&gt;My first impression is that, with Access, the most efficient setting is to commit every 1 row. It's unusual, but that's how it is in my case.</description>
      <pubDate>Fri, 13 Jun 2008 09:44:08 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Performance-on-a-job/m-p/2199221#M2227</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2008-06-13T09:44:08Z</dc:date>
    </item>
    <item>
      <title>Re: Performance on a job</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Performance-on-a-job/m-p/2199222#M2228</link>
      <description>So I ran my benchmarks and the results are not good...&lt;BR /&gt;In fact I ran two types of benchmark. The first is the complete job, and the best time is obtained with a commit value of 10 on the tAccessOutput. The best time is 22 minutes versus 9 minutes with my loader in Access.&lt;BR /&gt;Then I created the same job without the write to Access (I deleted the output component). The time is 4 minutes 40 seconds. This is very good.&lt;BR /&gt;Then I created a job that only writes to Access. It writes 500 000 rows generated by the tRowGenerator. The best time is obtained with a commit value of 125 000: 5 minutes 39 seconds. This is also efficient.&lt;BR /&gt;Finally, I created two jobs: one that only extracts the data and ends with the tBufferOutput component, and another that takes the data from the first job and writes it to Access. But the performance is bad: after two hours, it had written just 125 000 rows.&lt;BR /&gt;&lt;BR /&gt;What can I do to improve performance? Does anyone have a good idea?</description>
      <pubDate>Mon, 16 Jun 2008 13:37:38 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Performance-on-a-job/m-p/2199222#M2228</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2008-06-16T13:37:38Z</dc:date>
    </item>
  </channel>
</rss>

