<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: tPostgresqlInput Query Slow Through Talend, Yet Fast When Running on DB Visualizer in Talend Studio</title>
    <link>https://community.qlik.com/t5/Talend-Studio/tPostgresqlInput-Query-Slow-Through-Talend-Yet-Fast-When-Running/m-p/2325987#M95517</link>
    <description>&lt;P&gt;Sabrina,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I am using Enterprise Edition, version is 6.4.1.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The job runs successfully, but the run time grows far faster than the dataset size.&amp;nbsp; For example, a 5,000 record dataset takes 2 seconds to run.&amp;nbsp; A 25,000 record dataset takes 5 minutes to run.&amp;nbsp; A 100,000 record dataset has taken over 3 hours to run.&amp;nbsp; When I run the input query using pgAdmin or DB Visualizer, the query takes less than a second.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Using different logging options, I can see that queryCircRef is the component taking a long time to run.&amp;nbsp; I have also tried creating a custom function at the database level, but I see the same results.&amp;nbsp; Running the custom function at the db level is quick, but running it through Talend is super slow.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thanks for your help!&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;BR /&gt;&lt;A href="https://community.qlik.com/legacyfs/online/tlnd_dw_files/0683p000009LsVQ"&gt;Long Running Job.PNG&lt;/A&gt;</description>
    <pubDate>Mon, 02 Apr 2018 16:00:46 GMT</pubDate>
    <dc:creator>Nickster19</dc:creator>
    <dc:date>2018-04-02T16:00:46Z</dc:date>
    <item>
      <title>tPostgresqlInput Query Slow Through Talend, Yet Fast When Running on DB Visualizer</title>
      <link>https://community.qlik.com/t5/Talend-Studio/tPostgresqlInput-Query-Slow-Through-Talend-Yet-Fast-When-Running/m-p/2325985#M95515</link>
      <description>&lt;P&gt;Hey all,&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;I have a tough question about one of my components that is not scaling well.&amp;nbsp; I am running a query using the tPostgresqlInput component.&amp;nbsp; The query, shown below, checks for circular references between an employee and the manager in a single table.&amp;nbsp; For testing purposes, I have been using a 25,000 record dataset.&amp;nbsp; When I run this query using pgAdmin or DB Visualizer, it takes less than 2 seconds to run.&amp;nbsp; Additionally, using SAP BODS (our current ETL tool), the query also runs in less than 2 seconds.&amp;nbsp; However, when I run this query through tPostgresqlInput, it takes over 5 minutes to run.&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;I have tried a few different things to speed this query up, but to no avail.&amp;nbsp; Inside Talend, I have tried adding additional memory on the JVM and running the subjob in parallel.&amp;nbsp; Comparing the run time of the query between Talend and the database level shows that there is something hampering the performance when it is run through Talend.&amp;nbsp; Any help understanding why Talend struggles with this query would be greatly appreciated.&amp;nbsp; We need to be able to scale this query to process up to 500,000 records.&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;Thank you for any help!&amp;nbsp; I truly appreciate it!&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;Running Query in DB Visualizer: 0.003 seconds&lt;/P&gt; 
&lt;P&gt;Running Query in Talend: 346 seconds&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;Query:&lt;/P&gt; 
&lt;P&gt;WITH RECURSIVE circular_managers(unique_id, mgr_unique_id, depth, path, cycle) AS (&lt;BR /&gt;SELECT u.unique_id, u.mgr_unique_id, 1,&lt;BR /&gt;ARRAY[u.unique_id || '']::varchar[],&lt;BR /&gt;false&lt;BR /&gt;FROM table1 u&lt;BR /&gt;WHERE u.unique_id IS NOT NULL&lt;BR /&gt;UNION ALL&lt;BR /&gt;SELECT u.unique_id, u.mgr_unique_id, cm.depth + 1,&lt;BR /&gt;path || u.unique_id,&lt;BR /&gt;u.unique_id = ANY(path)&lt;BR /&gt;FROM table1 u, circular_managers cm&lt;BR /&gt;WHERE u.unique_id = cm.mgr_unique_id&lt;BR /&gt;AND u.unique_id IS NOT NULL&lt;BR /&gt;AND NOT cycle&lt;BR /&gt;)&lt;BR /&gt;SELECT&lt;BR /&gt;depth,&lt;BR /&gt;array_to_string(path, ' &amp;gt; ') circular_managers&lt;BR /&gt;FROM circular_managers&lt;BR /&gt;WHERE cycle&lt;BR /&gt;AND array_to_string(path, ' &amp;gt; ') NOT LIKE ' &amp;gt;%'&lt;BR /&gt;AND array_to_string(path, ' &amp;gt; ') NOT LIKE '% &amp;gt; &amp;gt; %'&lt;BR /&gt;AND path[1] = path[array_upper(path, 1)]&lt;BR /&gt;group by 1,2&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 21 Mar 2018 15:59:15 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/tPostgresqlInput-Query-Slow-Through-Talend-Yet-Fast-When-Running/m-p/2325985#M95515</guid>
      <dc:creator>Nickster19</dc:creator>
      <dc:date>2018-03-21T15:59:15Z</dc:date>
    </item>
    <item>
      <title>Re: tPostgresqlInput Query Slow Through Talend, Yet Fast When Running on DB Visualizer</title>
      <link>https://community.qlik.com/t5/Talend-Studio/tPostgresqlInput-Query-Slow-Through-Talend-Yet-Fast-When-Running/m-p/2325986#M95516</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;
&lt;P&gt;What does your ETL job look like? Are you able to run your job successfully in Talend Studio?&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;Could you please clarify which Talend version/edition you are using?&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;Best regards&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;Sabrina&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 28 Mar 2018 09:08:47 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/tPostgresqlInput-Query-Slow-Through-Talend-Yet-Fast-When-Running/m-p/2325986#M95516</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2018-03-28T09:08:47Z</dc:date>
    </item>
    <item>
      <title>Re: tPostgresqlInput Query Slow Through Talend, Yet Fast When Running on DB Visualizer</title>
      <link>https://community.qlik.com/t5/Talend-Studio/tPostgresqlInput-Query-Slow-Through-Talend-Yet-Fast-When-Running/m-p/2325987#M95517</link>
      <description>&lt;P&gt;Sabrina,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I am using Enterprise Edition, version is 6.4.1.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The job runs successfully, but the run time grows far faster than the dataset size.&amp;nbsp; For example, a 5,000 record dataset takes 2 seconds to run.&amp;nbsp; A 25,000 record dataset takes 5 minutes to run.&amp;nbsp; A 100,000 record dataset has taken over 3 hours to run.&amp;nbsp; When I run the input query using pgAdmin or DB Visualizer, the query takes less than a second.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Using different logging options, I can see that queryCircRef is the component taking a long time to run.&amp;nbsp; I have also tried creating a custom function at the database level, but I see the same results.&amp;nbsp; Running the custom function at the db level is quick, but running it through Talend is super slow.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thanks for your help!&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;BR /&gt;&lt;A href="https://community.qlik.com/legacyfs/online/tlnd_dw_files/0683p000009LsVQ"&gt;Long Running Job.PNG&lt;/A&gt;</description>
      <pubDate>Mon, 02 Apr 2018 16:00:46 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/tPostgresqlInput-Query-Slow-Through-Talend-Yet-Fast-When-Running/m-p/2325987#M95517</guid>
      <dc:creator>Nickster19</dc:creator>
      <dc:date>2018-04-02T16:00:46Z</dc:date>
    </item>
    <item>
      <title>Re: tPostgresqlInput Query Slow Through Talend, Yet Fast When Running on DB Visualizer</title>
      <link>https://community.qlik.com/t5/Talend-Studio/tPostgresqlInput-Query-Slow-Through-Talend-Yet-Fast-When-Running/m-p/2325988#M95518</link>
      <description>&lt;P&gt;&lt;SPAN class=""&gt;Nickster19, Were you able to find any solution to the performance issue? We are having the same issue&amp;nbsp;extracting data from PostgreSQL database. After enabling parallelization at subjob level and increasing JVM -Xmx to almost 18 GB, the job still runs at 70 rows/sec for each thread - which comes to around 350 rows/second.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN class=""&gt;Thanks&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN class=""&gt;Raj&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 21 Jun 2018 19:39:25 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/tPostgresqlInput-Query-Slow-Through-Talend-Yet-Fast-When-Running/m-p/2325988#M95518</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2018-06-21T19:39:25Z</dc:date>
    </item>
  </channel>
</rss>

