<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Slow Insertion in Amazon Redshift in Talend Studio</title>
    <link>https://community.qlik.com/t5/Talend-Studio/Slow-Insertion-in-Amazon-Redshift/m-p/2317099#M87584</link>
    <description>All,
&lt;BR /&gt;I have reported this behaviour on JIRA; our R&amp;amp;D team will investigate.
&lt;BR /&gt;The issue URL is: 
&lt;A href="https://jira.talendforge.org/browse/TDI-26155" rel="nofollow noopener noreferrer"&gt;https://jira.talendforge.org/browse/TDI-26155&lt;/A&gt;
&lt;BR /&gt;Regards,</description>
    <pubDate>Thu, 23 May 2013 18:09:01 GMT</pubDate>
    <dc:creator>Anonymous</dc:creator>
    <dc:date>2013-05-23T18:09:01Z</dc:date>
    <item>
      <title>Slow Insertion in Amazon Redshift</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Slow-Insertion-in-Amazon-Redshift/m-p/2317087#M87572</link>
      <description>Hi, 
&lt;BR /&gt;We have just created a simple job to fetch data from a MySQL table (from a local database and from Amazon RDS) with 300,000 rows and to insert these rows into Redshift. It took more than 4 hours. 
&lt;BR /&gt;1. Why is it so slow to fetch data from a single table and insert it into Amazon Redshift using Talend Open Studio for Big Data? 
&lt;BR /&gt;2. Is there a way to do a fast insertion, one that finishes in less than 5 minutes? 
&lt;BR /&gt;Please find the attached screenshots for details. 
&lt;BR /&gt;thanks! 
&lt;BR /&gt; 
&lt;span class="lia-inline-image-display-wrapper" image-alt="0683p000009MDkY.png"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/142882iDAADE977DE54368A/image-size/large?v=v2&amp;amp;px=999" role="button" title="0683p000009MDkY.png" alt="0683p000009MDkY.png" /&gt;&lt;/span&gt; 
&lt;span class="lia-inline-image-display-wrapper" image-alt="0683p000009MEOU.png"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/136977i4F20573065341F6D/image-size/large?v=v2&amp;amp;px=999" role="button" title="0683p000009MEOU.png" alt="0683p000009MEOU.png" /&gt;&lt;/span&gt;</description>
      <pubDate>Sat, 16 Nov 2024 12:02:23 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Slow-Insertion-in-Amazon-Redshift/m-p/2317087#M87572</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2024-11-16T12:02:23Z</dc:date>
    </item>
    <item>
      <title>Re: Slow Insertion in Amazon Redshift</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Slow-Insertion-in-Amazon-Redshift/m-p/2317088#M87573</link>
      <description>Hi, &lt;BR /&gt;Did you set the "Commit every" option in &lt;A href="https://help.talend.com/search/all?query=tRedshiftOutput&amp;amp;content-lang=en" target="_blank" rel="nofollow noopener noreferrer"&gt;tRedshiftOutput&lt;/A&gt;, and is there any complicated SQL query in your input component? What is your current rate?&lt;BR /&gt;Best regards&lt;BR /&gt;Sabrina</description>
      <pubDate>Tue, 07 May 2013 04:34:31 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Slow-Insertion-in-Amazon-Redshift/m-p/2317088#M87573</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2013-05-07T04:34:31Z</dc:date>
    </item>
    <item>
      <title>Re: Slow Insertion in Amazon Redshift</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Slow-Insertion-in-Amazon-Redshift/m-p/2317089#M87574</link>
      <description>Hi,
&lt;BR /&gt;Yes, "Commit every" is set to 10,000. There is no complicated query; it is the simple query given below:
&lt;BR /&gt;"select 
&lt;BR /&gt; `dim_ipdata_id` ,
&lt;BR /&gt; `ipdata_ip` ,
&lt;BR /&gt; `ipdata_isp` ,
&lt;BR /&gt; `ipdata_org` ,
&lt;BR /&gt; `ipdata_country` ,
&lt;BR /&gt; `ipdata_city` ,
&lt;BR /&gt; `ipdata_postal_code` ,
&lt;BR /&gt; `ipdata_longitude` ,
&lt;BR /&gt; `ipdata_latitude` ,
&lt;BR /&gt; `ipdata_area_code` ,
&lt;BR /&gt; `ipdata_metro_code` ,
&lt;BR /&gt; `ipdata_category` 
&lt;BR /&gt; from dim_ipdata"
&lt;BR /&gt;The current rate is 8 rows per second. Any idea what could be going wrong there?</description>
      <pubDate>Tue, 07 May 2013 07:53:00 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Slow-Insertion-in-Amazon-Redshift/m-p/2317089#M87574</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2013-05-07T07:53:00Z</dc:date>
    </item>
    <item>
      <title>Re: Slow Insertion in Amazon Redshift</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Slow-Insertion-in-Amazon-Redshift/m-p/2317090#M87575</link>
      <description>Hi, 
&lt;BR /&gt; 8 rows per second is not a normal rate. From your screenshot, the tMap component is only used to map data, with no other action. With a large data set, tMap can consume too much memory. How about removing it, so that the workflow is tAmazonMysqlInput--&amp;gt;tRedshiftOutput, or storing the data on disk instead of in memory?
&lt;BR /&gt;Best regards
&lt;BR /&gt;Sabrina</description>
      <pubDate>Tue, 07 May 2013 09:01:33 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Slow-Insertion-in-Amazon-Redshift/m-p/2317090#M87575</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2013-05-07T09:01:33Z</dc:date>
    </item>
    <item>
      <title>Re: Slow Insertion in Amazon Redshift</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Slow-Insertion-in-Amazon-Redshift/m-p/2317091#M87576</link>
      <description>Hi,
&lt;BR /&gt;Thanks for the valuable suggestions. 
&lt;BR /&gt;I removed tMap so that the job is tAmazonMysqlInput--&amp;gt;tRedshiftOutput, but even that didn't help. Regarding your second suggestion of storing data on disk, would I still have to use tMap for that, or is there another alternative? 
&lt;BR /&gt;Thanks!
&lt;BR /&gt;Best Regards,
&lt;BR /&gt;Ilyas</description>
      <pubDate>Tue, 07 May 2013 09:39:13 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Slow-Insertion-in-Amazon-Redshift/m-p/2317091#M87576</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2013-05-07T09:39:13Z</dc:date>
    </item>
    <item>
      <title>Re: Slow Insertion in Amazon Redshift</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Slow-Insertion-in-Amazon-Redshift/m-p/2317092#M87577</link>
      <description>Hi, 
&lt;BR /&gt;I don't think the second suggestion will work for you: removing tMap did not help, which means tMap is not the bottleneck.
&lt;BR /&gt;For the "Commit every" option, is there any improvement if you change the value from 10,000? The best value depends on your database, since each commit consumes DB server resources. 
&lt;BR /&gt;In addition, resources differ between database servers, so there is no fixed standard value. 
&lt;BR /&gt;Best regards
&lt;BR /&gt;Sabrina</description>
      <pubDate>Tue, 07 May 2013 10:17:27 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Slow-Insertion-in-Amazon-Redshift/m-p/2317092#M87577</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2013-05-07T10:17:27Z</dc:date>
    </item>
    <item>
      <title>Re: Slow Insertion in Amazon Redshift</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Slow-Insertion-in-Amazon-Redshift/m-p/2317093#M87578</link>
      <description>Hi, 
&lt;BR /&gt;Yes, but even removing tMap may not help us in the long run, because that's the component that will allow us to manipulate strings, URLs, joins, variables, etc. For now we are just testing Talend for Redshift by applying the simplest possible data transformation. 
&lt;BR /&gt;I've changed "Commit every" from 10000 to 1000 and still no luck! 
&lt;span class="lia-inline-image-display-wrapper" image-alt="0683p000009MPcz.png"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/157233iD1A564EF62DE3BC2/image-size/large?v=v2&amp;amp;px=999" role="button" title="0683p000009MPcz.png" alt="0683p000009MPcz.png" /&gt;&lt;/span&gt; 
&lt;BR /&gt;I used tRedshiftConnection to set up the connection and then set tRedshiftInput's connection to "Use existing connection", with tRedshiftConnection as the reference. But that kept giving me a NullPointerException, so I had to provide all the connection details inside tRedshiftInput, and it no longer uses "Use existing connection". Could that be a problem? 
&lt;BR /&gt;Best Regards, 
&lt;BR /&gt;Ilyas</description>
      <pubDate>Tue, 07 May 2013 10:39:58 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Slow-Insertion-in-Amazon-Redshift/m-p/2317093#M87578</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2013-05-07T10:39:58Z</dc:date>
    </item>
    <item>
      <title>Re: Slow Insertion in Amazon Redshift</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Slow-Insertion-in-Amazon-Redshift/m-p/2317094#M87579</link>
      <description>Hi, &lt;BR /&gt;To be honest, it is a very new component and I'm building a testing environment for it to see if I can get the same issue as yours. I'll come back to you asap, sorry for the inconvenience.&lt;BR /&gt;Best regards&lt;BR /&gt;Sabrina</description>
      <pubDate>Wed, 08 May 2013 09:02:44 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Slow-Insertion-in-Amazon-Redshift/m-p/2317094#M87579</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2013-05-08T09:02:44Z</dc:date>
    </item>
    <item>
      <title>Re: Slow Insertion in Amazon Redshift</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Slow-Insertion-in-Amazon-Redshift/m-p/2317095#M87580</link>
      <description>&lt;BLOCKQUOTE&gt;
 &lt;TABLE border="1"&gt;
  &lt;TBODY&gt;
   &lt;TR&gt;
    &lt;TD&gt;Hi, &lt;BR /&gt;To be honest, it is a very new component and I'm building a testing environment for it to see if I can get the same issue as yours. I'll come back to you asap, sorry for the inconvenience.&lt;BR /&gt;Best regards&lt;BR /&gt;Sabrina&lt;/TD&gt;
   &lt;/TR&gt;
  &lt;/TBODY&gt;
 &lt;/TABLE&gt;
&lt;/BLOCKQUOTE&gt;
&lt;BR /&gt;Hi,
&lt;BR /&gt;I've got exactly the same problem. So is there any solution? Inserts into Redshift are too slow.
&lt;BR /&gt;Best Regards,
&lt;BR /&gt;~Sergejs</description>
      <pubDate>Fri, 10 May 2013 10:00:46 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Slow-Insertion-in-Amazon-Redshift/m-p/2317095#M87580</guid>
      <dc:creator>_AnonymousUser</dc:creator>
      <dc:date>2013-05-10T10:00:46Z</dc:date>
    </item>
    <item>
      <title>Re: Slow Insertion in Amazon Redshift</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Slow-Insertion-in-Amazon-Redshift/m-p/2317096#M87581</link>
      <description>Hi there,&lt;BR /&gt;We have the same problem: 4 or 8 rows per second.&lt;BR /&gt;We tested different sources such as MySQL and PostgreSQL, but still have the same problem.&lt;BR /&gt;Next, we tried with Talend Open Studio for Data Integration and Talend for Big Data, but the problem remains.&lt;BR /&gt;We also tried a PostgreSQL bulk insert, but it failed with an error like this:&lt;BR /&gt;"Exception in component tPostgresqlOutputBulkExec_1_tPBE&lt;BR /&gt;org.postgresql.util.PSQLException: ERROR: COPY CSV is not supported"&lt;BR /&gt;Any help, please?&lt;BR /&gt;Thanks!&lt;BR /&gt;Leo</description>
      <pubDate>Fri, 10 May 2013 19:34:40 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Slow-Insertion-in-Amazon-Redshift/m-p/2317096#M87581</guid>
      <dc:creator>_AnonymousUser</dc:creator>
      <dc:date>2013-05-10T19:34:40Z</dc:date>
    </item>
    <item>
      <title>Re: Slow Insertion in Amazon Redshift</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Slow-Insertion-in-Amazon-Redshift/m-p/2317097#M87582</link>
      <description>Hello, 
&lt;BR /&gt;We are facing the same problem too. Our MySQL database is installed on Amazon EC2 (in the same region as our Redshift instance). 
&lt;BR /&gt;I have set the "Commit every" option to 10000 in the tRedshiftOutput component and am not using any tMap component. The input is also a plain SELECT statement from MySQL. 
&lt;BR /&gt;For 10300 rows (the table is just about 10 MB in MySQL) it took about 7-8 min, and for 440000 rows (about 50 MB in size) it took about 7 hours. 
&lt;BR /&gt;I have tried using the JDBC output component as well, but it didn't make any difference. 
&lt;BR /&gt;Is there any solution for increasing the performance while using the Redshift component? 
&lt;BR /&gt;Right now the best approach I have found is to write the output to a flat file, upload it to an S3 bucket, and use the COPY command to load it into Redshift. This approach takes less than a minute for the whole thing, but it is not very convenient and requires an external script. 
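&lt;BR /&gt;A minimal sketch of this flat-file approach, in Python with the standard library only. The sample rows, the bucket name, and the IAM role ARN are hypothetical, and authorizing COPY with IAM_ROLE is an assumption (your cluster may need a different authorization clause). The S3 upload itself is not shown; the sketch only produces the CSV text and the COPY statement that would be run against Redshift afterwards.

```python
import csv
import io


def rows_to_csv(rows):
    """Serialize rows (an iterable of tuples) to CSV text for a flat file."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


def build_copy_statement(table, s3_uri, iam_role):
    """Build a Redshift COPY statement that loads a CSV file from S3."""
    return (
        f"COPY {table} FROM '{s3_uri}' "
        f"IAM_ROLE '{iam_role}' "
        "FORMAT AS CSV;"
    )


# Hypothetical sample rows, bucket and role, for illustration only.
sample_rows = [(1, "10.0.0.1", "ExampleISP"), (2, "10.0.0.2", "Other, Inc.")]
csv_text = rows_to_csv(sample_rows)
copy_sql = build_copy_statement(
    "dim_ipdata",
    "s3://example-bucket/dim_ipdata.csv",
    "arn:aws:iam::123456789012:role/ExampleRedshiftRole",
)
print(csv_text)
print(copy_sql)
```

&lt;BR /&gt;The point of the design is that Redshift receives one COPY statement per file instead of one INSERT per row.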
&lt;BR /&gt;Thanks 
&lt;BR /&gt;Aditya</description>
      <pubDate>Thu, 23 May 2013 07:15:50 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Slow-Insertion-in-Amazon-Redshift/m-p/2317097#M87582</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2013-05-23T07:15:50Z</dc:date>
    </item>
    <item>
      <title>Re: Slow Insertion in Amazon Redshift</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Slow-Insertion-in-Amazon-Redshift/m-p/2317098#M87583</link>
      <description>Hi Aditya, 
&lt;BR /&gt;We would appreciate it if you could open a JIRA issue in the Talend DI project of the 
&lt;A href="https://jira.talendforge.org/secure/Dashboard.jspa" target="_blank" rel="nofollow noopener noreferrer"&gt;JIRA bugtracker&lt;/A&gt;. Our developers will check whether it is a bug and provide a solution. 
&lt;BR /&gt;Please post the JIRA issue link on the forum so that other community users know about it. 
&lt;BR /&gt;Best regards 
&lt;BR /&gt;Sabrina</description>
      <pubDate>Thu, 23 May 2013 08:34:24 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Slow-Insertion-in-Amazon-Redshift/m-p/2317098#M87583</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2013-05-23T08:34:24Z</dc:date>
    </item>
    <item>
      <title>Re: Slow Insertion in Amazon Redshift</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Slow-Insertion-in-Amazon-Redshift/m-p/2317099#M87584</link>
      <description>All,
&lt;BR /&gt;I have reported this behaviour on JIRA; our R&amp;amp;D team will investigate.
&lt;BR /&gt;The issue URL is: 
&lt;A href="https://jira.talendforge.org/browse/TDI-26155" rel="nofollow noopener noreferrer"&gt;https://jira.talendforge.org/browse/TDI-26155&lt;/A&gt;
&lt;BR /&gt;Regards,</description>
      <pubDate>Thu, 23 May 2013 18:09:01 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Slow-Insertion-in-Amazon-Redshift/m-p/2317099#M87584</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2013-05-23T18:09:01Z</dc:date>
    </item>
    <item>
      <title>Re: Slow Insertion in Amazon Redshift</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Slow-Insertion-in-Amazon-Redshift/m-p/2317100#M87585</link>
      <description>Hi All, 
&lt;BR /&gt;Please vote for the JIRA issue 
&lt;A href="https://jira.talendforge.org/browse/TDI-26155" rel="nofollow noopener noreferrer"&gt;https://jira.talendforge.org/browse/TDI-26155&lt;/A&gt; created by adiallo, and add your comments to it. 
&lt;BR /&gt;Best regards 
&lt;BR /&gt;Sabrina</description>
      <pubDate>Fri, 24 May 2013 04:21:24 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Slow-Insertion-in-Amazon-Redshift/m-p/2317100#M87585</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2013-05-24T04:21:24Z</dc:date>
    </item>
    <item>
      <title>Re: Slow Insertion in Amazon Redshift</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Slow-Insertion-in-Amazon-Redshift/m-p/2317101#M87586</link>
      <description>Hi,
&lt;BR /&gt;The current component uses single-row INSERT statements to write into Redshift. According to the Redshift documentation and best practices, this approach is very inefficient.
&lt;BR /&gt;There are several ways to fix this issue. One of them is the COPY command, which loads data files located on S3 or data from DynamoDB. You could use this command with the tRedshiftRow component. Another is the multi-row insert, which R&amp;amp;D is going to implement under TDI-26155.
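&lt;BR /&gt;To illustrate the multi-row insert idea, here is a small sketch in Python. It is not the Talend component's implementation: sqlite3 is used only as a local stand-in so the example runs without a cluster, the sample data and the batch size of 3 are hypothetical, and the '?' placeholder style is sqlite3's (a Redshift driver may use a different one).

```python
import sqlite3


def chunked(rows, size):
    """Yield successive lists of at most `size` rows."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


def build_multirow_insert(table, columns, row_count):
    """Build one INSERT with `row_count` value tuples and '?' placeholders."""
    one_row = "(" + ", ".join("?" for _ in columns) + ")"
    all_rows = ", ".join(one_row for _ in range(row_count))
    return f"INSERT INTO {table} ({', '.join(columns)}) VALUES {all_rows}"


# sqlite3 is only a local stand-in so the sketch runs without a cluster.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dim_ipdata (dim_ipdata_id INTEGER, ipdata_ip TEXT)")

columns = ["dim_ipdata_id", "ipdata_ip"]
rows = [(i, f"10.0.0.{i}") for i in range(1, 8)]  # hypothetical sample data

statements = 0
for batch in chunked(rows, 3):
    sql = build_multirow_insert("dim_ipdata", columns, len(batch))
    # Flatten the batch so the values line up with the placeholders.
    params = [value for row in batch for value in row]
    conn.execute(sql, params)
    statements += 1
conn.commit()

inserted = conn.execute("SELECT COUNT(*) FROM dim_ipdata").fetchone()[0]
print(statements, inserted)
```

&lt;BR /&gt;Each statement carries a whole batch of rows, so the number of round trips to the database drops from one per row to one per batch.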
&lt;BR /&gt;Rémy.</description>
      <pubDate>Fri, 24 May 2013 14:24:55 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Slow-Insertion-in-Amazon-Redshift/m-p/2317101#M87586</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2013-05-24T14:24:55Z</dc:date>
    </item>
    <item>
      <title>Re: Slow Insertion in Amazon Redshift</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Slow-Insertion-in-Amazon-Redshift/m-p/2317102#M87587</link>
      <description>Hi,&lt;BR /&gt;Have any improvements been made to the Redshift components so far in the new version, Talend BD 5.4?&lt;BR /&gt;Best regards!</description>
      <pubDate>Thu, 28 Nov 2013 14:08:10 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Slow-Insertion-in-Amazon-Redshift/m-p/2317102#M87587</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2013-11-28T14:08:10Z</dc:date>
    </item>
    <item>
      <title>Re: Slow Insertion in Amazon Redshift</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Slow-Insertion-in-Amazon-Redshift/m-p/2317103#M87588</link>
      <description>Any news on this? I am interested in using Talend to ETL into Redshift from MySQL.&lt;BR /&gt;I have gotten much faster performance by using Talend to write files out to S3 and then using Amazon tools to load them into Redshift. The issue was that large files still took a while, with a lot of I/O spent writing to a file and then uploading to the cloud. One could use Amazon's Data Pipeline, I suppose, but then we lose the rich features of Talend transformations...</description>
      <pubDate>Tue, 07 Oct 2014 15:13:26 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Slow-Insertion-in-Amazon-Redshift/m-p/2317103#M87588</guid>
      <dc:creator>hson</dc:creator>
      <dc:date>2014-10-07T15:13:26Z</dc:date>
    </item>
    <item>
      <title>Re: Slow Insertion in Amazon Redshift</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Slow-Insertion-in-Amazon-Redshift/m-p/2317104#M87589</link>
      <description>I think these connectors don't have the bulk feature. On the input you cannot set a cursor size, and on the output you cannot set a batch size. Try the regular MySQL/PostgreSQL components, which do have these features.&lt;BR /&gt;We had something similar with Greenplum.</description>
      <pubDate>Wed, 25 Mar 2015 12:35:04 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Slow-Insertion-in-Amazon-Redshift/m-p/2317104#M87589</guid>
      <dc:creator>Dezzsoke</dc:creator>
      <dc:date>2015-03-25T12:35:04Z</dc:date>
    </item>
    <item>
      <title>Re: Slow Insertion in Amazon Redshift</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Slow-Insertion-in-Amazon-Redshift/m-p/2317105#M87590</link>
      <description>Now that the bulk feature exists - how do we connect to it?</description>
      <pubDate>Thu, 29 Oct 2015 18:09:31 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Slow-Insertion-in-Amazon-Redshift/m-p/2317105#M87590</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2015-10-29T18:09:31Z</dc:date>
    </item>
    <item>
      <title>Re: Slow Insertion in Amazon Redshift</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Slow-Insertion-in-Amazon-Redshift/m-p/2317106#M87591</link>
      <description>Hi, did anyone find a solution for this? I am facing the same problem.&lt;BR /&gt;I am reading data from MySQL and loading it into Redshift, but the jobs are too slow.</description>
      <pubDate>Tue, 28 Jun 2016 07:44:15 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Slow-Insertion-in-Amazon-Redshift/m-p/2317106#M87591</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2016-06-28T07:44:15Z</dc:date>
    </item>
  </channel>
</rss>

