<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: tStatCatcher improvements in Talend Studio</title>
    <link>https://community.qlik.com/t5/Talend-Studio/tStatCatcher-improvements/m-p/2326926#M96362</link>
    <description>&lt;BLOCKQUOTE&gt; 
 &lt;TABLE border="1"&gt; 
  &lt;TBODY&gt; 
   &lt;TR&gt; 
    &lt;TD&gt;- the tStatCatcher component does not log the number of rows processed&lt;/TD&gt; 
   &lt;/TR&gt; 
  &lt;/TBODY&gt; 
 &lt;/TABLE&gt; 
&lt;/BLOCKQUOTE&gt; 
&lt;BR /&gt;That's right, counting rows is the job of tFlowMeter + tFlowMeterCatcher. 
&lt;BR /&gt; 
&lt;BLOCKQUOTE&gt; 
 &lt;TABLE border="1"&gt; 
  &lt;TBODY&gt; 
   &lt;TR&gt; 
    &lt;TD&gt;- the tStatCatcher component does not log actual time spent in a component, just the total wallclock time from start to finish (which makes it much harder to debug which component needs optimization)&lt;/TD&gt; 
   &lt;/TR&gt; 
  &lt;/TBODY&gt; 
 &lt;/TABLE&gt; 
&lt;/BLOCKQUOTE&gt; 
&lt;BR /&gt;Yes, we know. It's due to our code generation model, where parts of the components are combined in the generated code. We have added tChronometerStart/tChronometerStop to measure the duration of a given component. I've posted a screenshot in this post as an example of how to do this. 
&lt;BR /&gt;The result is: 
&lt;BR /&gt; 
&lt;PRE&gt;Starting job topic5996 at 14:00 31/03/2009.&lt;BR /&gt; tPerlFlex_1 duration 770 ms (0 seconds), 100000 runs, average : 7 microseconds, min : 6 microseconds, max: 1796 microseconds, speed: 129870 rows/second&lt;BR /&gt; tMysqlOutput_1 10721 ms (10 seconds), 100000 runs, average : 107 microseconds, min : 95 microseconds, max: 10112 microseconds, speed: 9327 rows/second&lt;BR /&gt;===&lt;BR /&gt;execution time: 28309 milliseconds&lt;BR /&gt;===&lt;BR /&gt;Job topic5996 ended at 14:00 31/03/2009. &lt;/PRE&gt; 
&lt;BR /&gt;and if I activate "extended inserts" in tMysqlOutput_1: 
&lt;BR /&gt; 
&lt;PRE&gt;Starting job topic5996 at 14:10 31/03/2009.&lt;BR /&gt; tPerlFlex_1 duration 705 ms (0 seconds), 100000 runs, average : 7 microseconds, min : 5 microseconds, max: 3548 microseconds, speed: 141843 rows/second&lt;BR /&gt; tMysqlOutput_1 1877 ms (1 second), 100000 runs, average : 18 microseconds, min : 6 microseconds, max: 13739 microseconds, speed: 53276 rows/second&lt;BR /&gt;===&lt;BR /&gt;execution time: 18152 milliseconds&lt;BR /&gt;===&lt;BR /&gt;Job topic5996 ended at 14:11 31/03/2009. &lt;/PRE&gt; 
&lt;BR /&gt;So the remaining time, approximately 16 seconds in both runs, is spent in tRowGenerator. 
&lt;BR /&gt; 
&lt;BR /&gt; 
&lt;BLOCKQUOTE&gt; 
 &lt;TABLE border="1"&gt; 
  &lt;TBODY&gt; 
   &lt;TR&gt; 
    &lt;TD&gt;- the tStatCatcher component does not allow the timestamp in UTC&lt;/TD&gt; 
   &lt;/TR&gt; 
  &lt;/TBODY&gt; 
 &lt;/TABLE&gt; 
&lt;/BLOCKQUOTE&gt; 
&lt;BR /&gt;You mean you want the timestamp to be timezone independent? I think you are perfectly right (should I say we made a design mistake?). What I can propose is a "localtimeToUTCtime" routine to use right after tStatCatcher. 
&lt;BR /&gt;The problem with changing tStatCatcher itself is the existing data that has already been generated in local time. 
&lt;BR /&gt; 
&lt;BLOCKQUOTE&gt; 
 &lt;TABLE border="1"&gt; 
  &lt;TBODY&gt; 
   &lt;TR&gt; 
    &lt;TD&gt;- And one more thing - do you think it would be a good idea to modify tStatCatcher to be able to "poll" data after a certain amount of time and give the status of the job? Currently, if some long-running component starts, there is no way to see the progress of the job. This may be a configurable parameter in the IDE.&lt;/TD&gt; 
   &lt;/TR&gt; 
  &lt;/TBODY&gt; 
 &lt;/TABLE&gt; 
&lt;/BLOCKQUOTE&gt; 
&lt;BR /&gt;We have a feature that looks a bit like this with "statistics" in the Run view. For each row link, every 2 seconds, we send a message over a socket saying how many rows have already been processed. A long time ago, there was a thread (and not a fork) running in parallel with the main processing and sending a message over a socket every second. This was highly time consuming because there were many shared variables. 
&lt;BR /&gt;Anyway, your request would be perfectly possible: I mean adding a "running" status in addition to "begin" and "end". The problem would be in the reader. Here at Talend we offer 2 tools to read the data generated by tStatCatcher (Activity Monitoring Console and Talend Integration Suite Dashboard), and we would have to check that this feature doesn't break these tools.</description>
    <pubDate>Tue, 31 Mar 2009 13:55:02 GMT</pubDate>
    <dc:creator>Anonymous</dc:creator>
    <dc:date>2009-03-31T13:55:02Z</dc:date>
    <item>
      <title>tStatCatcher improvements</title>
      <link>https://community.qlik.com/t5/Talend-Studio/tStatCatcher-improvements/m-p/2326925#M96361</link>
      <description>- the tStatCatcher component does not log the number of rows processed 
&lt;BR /&gt;- the tStatCatcher component does not log actual time spent in a component, just the total wallclock time from start to finish (which makes it much harder to debug which component needs optimization) 
&lt;BR /&gt;- the tStatCatcher component does not allow the timestamp in UTC 
&lt;BR /&gt;- And one more thing - do you think it would be a good idea to modify tStatCatcher to be able to "poll" data after a certain amount of time and give the status of the job? Currently, if some long-running component starts, there is no way to see the progress of the job. This may be a configurable parameter in the IDE. 
&lt;BR /&gt;Or maybe there are some other components that can help us to do this?</description>
      <pubDate>Sat, 16 Nov 2024 14:01:11 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/tStatCatcher-improvements/m-p/2326925#M96361</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2024-11-16T14:01:11Z</dc:date>
    </item>
    <item>
      <title>Re: tStatCatcher improvements</title>
      <link>https://community.qlik.com/t5/Talend-Studio/tStatCatcher-improvements/m-p/2326926#M96362</link>
      <description>&lt;BLOCKQUOTE&gt; 
 &lt;TABLE border="1"&gt; 
  &lt;TBODY&gt; 
   &lt;TR&gt; 
    &lt;TD&gt;- the tStatCatcher component does not log the number of rows processed&lt;/TD&gt; 
   &lt;/TR&gt; 
  &lt;/TBODY&gt; 
 &lt;/TABLE&gt; 
&lt;/BLOCKQUOTE&gt; 
&lt;BR /&gt;That's right, counting rows is the job of tFlowMeter + tFlowMeterCatcher. 
&lt;BR /&gt; 
&lt;BLOCKQUOTE&gt; 
 &lt;TABLE border="1"&gt; 
  &lt;TBODY&gt; 
   &lt;TR&gt; 
    &lt;TD&gt;- the tStatCatcher component does not log actual time spent in a component, just the total wallclock time from start to finish (which makes it much harder to debug which component needs optimization)&lt;/TD&gt; 
   &lt;/TR&gt; 
  &lt;/TBODY&gt; 
 &lt;/TABLE&gt; 
&lt;/BLOCKQUOTE&gt; 
&lt;BR /&gt;Yes, we know. It's due to our code generation model, where parts of the components are combined in the generated code. We have added tChronometerStart/tChronometerStop to measure the duration of a given component. I've posted a screenshot in this post as an example of how to do this. 
&lt;BR /&gt;The result is: 
&lt;BR /&gt; 
&lt;PRE&gt;Starting job topic5996 at 14:00 31/03/2009.&lt;BR /&gt; tPerlFlex_1 duration 770 ms (0 seconds), 100000 runs, average : 7 microseconds, min : 6 microseconds, max: 1796 microseconds, speed: 129870 rows/second&lt;BR /&gt; tMysqlOutput_1 10721 ms (10 seconds), 100000 runs, average : 107 microseconds, min : 95 microseconds, max: 10112 microseconds, speed: 9327 rows/second&lt;BR /&gt;===&lt;BR /&gt;execution time: 28309 milliseconds&lt;BR /&gt;===&lt;BR /&gt;Job topic5996 ended at 14:00 31/03/2009. &lt;/PRE&gt; 
&lt;BR /&gt;and if I activate "extended inserts" in tMysqlOutput_1: 
&lt;BR /&gt; 
&lt;PRE&gt;Starting job topic5996 at 14:10 31/03/2009.&lt;BR /&gt; tPerlFlex_1 duration 705 ms (0 seconds), 100000 runs, average : 7 microseconds, min : 5 microseconds, max: 3548 microseconds, speed: 141843 rows/second&lt;BR /&gt; tMysqlOutput_1 1877 ms (1 second), 100000 runs, average : 18 microseconds, min : 6 microseconds, max: 13739 microseconds, speed: 53276 rows/second&lt;BR /&gt;===&lt;BR /&gt;execution time: 18152 milliseconds&lt;BR /&gt;===&lt;BR /&gt;Job topic5996 ended at 14:11 31/03/2009. &lt;/PRE&gt; 
&lt;BR /&gt;So the remaining time, approximately 16 seconds in both runs, is spent in tRowGenerator. 
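&lt;BR /&gt;As a rough check of the 16-second figure, the subtraction can be sketched in Java as follows. The class RemainingTime and its method are hypothetical helpers, not part of Talend, and the sketch assumes that everything not measured by the two chronometers is spent in tRowGenerator (job startup and database connection time are ignored). 

```java
// Hypothetical helper: not part of Talend, it only restates the subtraction.
class RemainingTime {
    // Total job time minus the time measured by each chronometer, in milliseconds.
    static long remainingMillis(long totalMillis, long... measuredMillis) {
        long remaining = totalMillis;
        for (long measured : measuredMillis) {
            remaining -= measured;
        }
        return remaining;
    }

    public static void main(String[] args) {
        // First run: 28309 ms total, tPerlFlex_1 770 ms, tMysqlOutput_1 10721 ms.
        System.out.println(remainingMillis(28309, 770, 10721)); // 16818
        // Second run (extended inserts): 18152 ms total, 705 ms and 1877 ms.
        System.out.println(remainingMillis(18152, 705, 1877)); // 15570
    }
}
```

&lt;BR /&gt;Both runs leave between 15.5 and 17 seconds unaccounted for, which is consistent with the estimate. 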
&lt;BR /&gt; 
&lt;BR /&gt; 
&lt;BLOCKQUOTE&gt; 
 &lt;TABLE border="1"&gt; 
  &lt;TBODY&gt; 
   &lt;TR&gt; 
    &lt;TD&gt;- the tStatCatcher component does not allow the timestamp in UTC&lt;/TD&gt; 
   &lt;/TR&gt; 
  &lt;/TBODY&gt; 
 &lt;/TABLE&gt; 
&lt;/BLOCKQUOTE&gt; 
&lt;BR /&gt;You mean you want the timestamp to be timezone independent? I think you are perfectly right (should I say we made a design mistake?). What I can propose is a "localtimeToUTCtime" routine to use right after tStatCatcher. 
&lt;BR /&gt;The problem with changing tStatCatcher itself is the existing data that has already been generated in local time. 
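&lt;BR /&gt;A minimal sketch of what such a routine might look like in Java mode. Only the name "localtimeToUTCtime" comes from the post. The class name TimeRoutines, the method signature, the use of java.time (which needs a modern JDK) and the Europe/Paris zone in the example are all assumptions made for illustration. 

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;

// Hypothetical routine: the name comes from the post, the signature is an assumption.
class TimeRoutines {
    // Interprets a zone-less local timestamp in the given zone
    // and returns the same instant expressed in UTC.
    static LocalDateTime localtimeToUTCtime(LocalDateTime localTime, ZoneId localZone) {
        return localTime.atZone(localZone)
                .withZoneSameInstant(ZoneOffset.UTC)
                .toLocalDateTime();
    }

    public static void main(String[] args) {
        // Assumed example: a timestamp logged on a server running in Europe/Paris.
        LocalDateTime logged = LocalDateTime.of(2009, 3, 31, 14, 0);
        System.out.println(localtimeToUTCtime(logged, ZoneId.of("Europe/Paris")));
    }
}
```

&lt;BR /&gt;Note that a conversion like this has to guess during the autumn clock change, when the same local time occurs twice (atZone picks the earlier offset). That is one reason logging in UTC from the start would be cleaner than converting afterwards. 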
&lt;BR /&gt; 
&lt;BLOCKQUOTE&gt; 
 &lt;TABLE border="1"&gt; 
  &lt;TBODY&gt; 
   &lt;TR&gt; 
    &lt;TD&gt;- And one more thing - do you think it would be a good idea to modify tStatCatcher to be able to "poll" data after a certain amount of time and give the status of the job? Currently, if some long-running component starts, there is no way to see the progress of the job. This may be a configurable parameter in the IDE.&lt;/TD&gt; 
   &lt;/TR&gt; 
  &lt;/TBODY&gt; 
 &lt;/TABLE&gt; 
&lt;/BLOCKQUOTE&gt; 
&lt;BR /&gt;We have a feature that looks a bit like this with "statistics" in the Run view. For each row link, every 2 seconds, we send a message over a socket saying how many rows have already been processed. A long time ago, there was a thread (and not a fork) running in parallel with the main processing and sending a message over a socket every second. This was highly time consuming because there were many shared variables. 
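&lt;BR /&gt;One way to report progress at a fixed interval without a parallel thread is to check the elapsed time from inside the processing loop. The following is only a sketch of that idea, not Talend's generated code: the class RowProgress, the injected clock and the reporter callback (a stand-in for the socket message) are all hypothetical. 

```java
import java.util.function.LongConsumer;
import java.util.function.LongSupplier;

// Hypothetical sketch, not Talend's generated code: a row counter that reports
// progress from inside the processing loop, so no second thread shares variables.
class RowProgress {
    private final long intervalMillis;
    private final LongSupplier clock;     // returns the current time in milliseconds
    private final LongConsumer reporter;  // receives the row count (stand-in for the socket message)
    private long rows = 0;
    private long lastReport;

    RowProgress(long intervalMillis, LongSupplier clock, LongConsumer reporter) {
        this.intervalMillis = intervalMillis;
        this.clock = clock;
        this.reporter = reporter;
        this.lastReport = clock.getAsLong();
    }

    // Call once per row; reports only when the interval has elapsed.
    void rowProcessed() {
        rows++;
        long now = clock.getAsLong();
        if (now - lastReport >= intervalMillis) {
            reporter.accept(rows);
            lastReport = now;
        }
    }

    public static void main(String[] args) {
        // Report every 2 seconds, as in the Run view statistics described above.
        RowProgress progress = new RowProgress(2000, System::currentTimeMillis,
                count -> System.out.println(count + " rows processed"));
        for (int i = 0; i != 100000; i++) {
            progress.rowProcessed();
        }
    }
}
```

&lt;BR /&gt;The trade-off is one clock read per row, and no message is sent while a single row blocks for a long time. 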
&lt;BR /&gt;Anyway, your request would be perfectly possible: I mean adding a "running" status in addition to "begin" and "end". The problem would be in the reader. Here at Talend we offer 2 tools to read the data generated by tStatCatcher (Activity Monitoring Console and Talend Integration Suite Dashboard), and we would have to check that this feature doesn't break these tools.</description>
      <pubDate>Tue, 31 Mar 2009 13:55:02 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/tStatCatcher-improvements/m-p/2326926#M96362</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2009-03-31T13:55:02Z</dc:date>
    </item>
    <item>
      <title>Re: tStatCatcher improvements</title>
      <link>https://community.qlik.com/t5/Talend-Studio/tStatCatcher-improvements/m-p/2326927#M96363</link>
      <description>Hi Mister Penguin, 
&lt;BR /&gt;It's really great that you're talking about the Chronometer components, because I am having difficulties using them the way you describe. 
&lt;BR /&gt;I don't want to duplicate posts (6027), but in your example you have a tChronometerStart following a tRowGenerator, and I am not able to create such a connection between these 2 components in my TOS version; it simply doesn't allow it. 
&lt;BR /&gt;I'm using TOS 3.0.3 (Build id: r21383-20090126-2207) under Windows Vista, in Java mode. (You are apparently in Perl mode; maybe that's the key...) 
&lt;BR /&gt;Another point: I don't get such detailed information from tChronometerStop (no min, max, average, or row rate...). 
&lt;BR /&gt;Talend spirits, please take the time to answer this post; my 3 previous posts seem to be invisible on this forum. 
&lt;span class="lia-inline-image-display-wrapper" image-alt="0683p000009MPcz.png"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/157233iD1A564EF62DE3BC2/image-size/large?v=v2&amp;amp;px=999" role="button" title="0683p000009MPcz.png" alt="0683p000009MPcz.png" /&gt;&lt;/span&gt; 
&lt;BR /&gt;Regards,</description>
      <pubDate>Tue, 31 Mar 2009 14:52:44 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/tStatCatcher-improvements/m-p/2326927#M96363</guid>
      <dc:creator>_AnonymousUser</dc:creator>
      <dc:date>2009-03-31T14:52:44Z</dc:date>
    </item>
    <item>
      <title>Re: tStatCatcher improvements</title>
      <link>https://community.qlik.com/t5/Talend-Studio/tStatCatcher-improvements/m-p/2326928#M96364</link>
      <description>The link in my previous post is wrong; it's the first time I'm creating one. I'll try like this: 6027&lt;BR /&gt;Regards.</description>
      <pubDate>Tue, 31 Mar 2009 14:56:22 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/tStatCatcher-improvements/m-p/2326928#M96364</guid>
      <dc:creator>_AnonymousUser</dc:creator>
      <dc:date>2009-03-31T14:56:22Z</dc:date>
    </item>
    <item>
      <title>Re: tStatCatcher improvements</title>
      <link>https://community.qlik.com/t5/Talend-Studio/tStatCatcher-improvements/m-p/2326929#M96365</link>
      <description>&lt;BLOCKQUOTE&gt; 
 &lt;TABLE border="1"&gt; 
  &lt;TBODY&gt; 
   &lt;TR&gt; 
    &lt;TD&gt;That's right, this is the tFlowMeter + tFlowMeterCatcher job&lt;BR /&gt;...&lt;/TD&gt; 
   &lt;/TR&gt; 
  &lt;/TBODY&gt; 
 &lt;/TABLE&gt; 
&lt;/BLOCKQUOTE&gt; 
&lt;BR /&gt;Thank you for your detailed reply. 
&lt;BR /&gt;I was just wondering: why do we have so many components when we could have all the functionality in one? I want times, number of rows processed and polling all in one, so that once I have added my tStatCatcher component, I know I'll have all the necessary information for debugging and fine-tuning my application. 
&lt;BR /&gt;As for the timer component, it's good for tuning the application while you develop it, but the behavior of the script might change when the size of the input/output data grows. So I pretty much need times for each and every component in my graph. Can we add timer functionality for all components? Or is it hard to implement?</description>
      <pubDate>Tue, 31 Mar 2009 15:08:53 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/tStatCatcher-improvements/m-p/2326929#M96365</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2009-03-31T15:08:53Z</dc:date>
    </item>
  </channel>
</rss>

