<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: S3 to Redshift issues. in Talend Studio</title>
    <link>https://community.qlik.com/t5/Talend-Studio/S3-to-Redshift-issues/m-p/2278008#M53602</link>
    <description>&lt;BLOCKQUOTE&gt;
 &lt;TABLE border="1"&gt;
  &lt;TBODY&gt;
   &lt;TR&gt;
    &lt;TD&gt;I am trying to move data from several files in an S3 bucket into Redshift tables. I have tried several options and wanted to know the best approach.&lt;BR /&gt;Note that I can copy the files from S3 into Redshift tables without Talend and it works great. However, I wanted to use Talend because I will be using this for incremental loads (without dropping/clearing the tables).&lt;BR /&gt;I have tried the following options.&lt;BR /&gt;1. tS3List --&amp;gt; tRedshiftBulkExec: fails with "delimiter not found". I am still trying to identify why this occurs; the COPY itself works.&lt;BR /&gt;- tRedshiftBulkExec requires me to re-enter my S3 credentials. This is frustrating, as I already have an S3 connection, and it would be great if the component could just reuse an existing connection.&lt;BR /&gt;2. tRedshiftRow: I enter the COPY command here and it works. However, this is not ideal when using an ETL tool; I could just as well use basic scripting. Ideally #1 should work.&lt;BR /&gt;3. S3 --&amp;gt; tS3Get (writes to local disk, ugh!!) --&amp;gt; tFileInput (from local) --&amp;gt; tRedshiftOutput (defeats the purpose of S3 to Redshift).&lt;BR /&gt;What is the best way to accomplish this using Talend OS for Big Data ver 6.1.1?&lt;/TD&gt;
   &lt;/TR&gt;
  &lt;/TBODY&gt;
 &lt;/TABLE&gt;
&lt;/BLOCKQUOTE&gt;</description>
    <pubDate>Fri, 17 Feb 2017 05:12:10 GMT</pubDate>
    <dc:creator>_AnonymousUser</dc:creator>
    <dc:date>2017-02-17T05:12:10Z</dc:date>
    <item>
      <title>S3 to Redshift issues.</title>
      <link>https://community.qlik.com/t5/Talend-Studio/S3-to-Redshift-issues/m-p/2278006#M53600</link>
      <description>I am trying to move data from several files in an S3 bucket into Redshift tables. I have tried several options and wanted to know the best approach.
&lt;BR /&gt;Note that I can copy the files from S3 into Redshift tables without Talend and it works great. However, I wanted to use Talend because I will be using this for incremental loads (without dropping/clearing the tables).
&lt;BR /&gt;I have tried the following options.
&lt;BR /&gt;1. tS3List --&amp;gt; tRedshiftBulkExec: fails with "delimiter not found". I am still trying to identify why this occurs; the COPY itself works.
&lt;BR /&gt;- tRedshiftBulkExec requires me to re-enter my S3 credentials. This is frustrating, as I already have an S3 connection, and it would be great if the component could just reuse an existing connection.
&lt;BR /&gt;2. tRedshiftRow: I enter the COPY command here and it works. However, this is not ideal when using an ETL tool; I could just as well use basic scripting. Ideally #1 should work.
&lt;BR /&gt;3. S3 --&amp;gt; tS3Get (writes to local disk, ugh!!) --&amp;gt; tFileInput (from local) --&amp;gt; tRedshiftOutput (defeats the purpose of S3 to Redshift).
&lt;BR /&gt;What is the best way to accomplish this using Talend OS for Big Data ver 6.1.1?</description>
      <pubDate>Sat, 16 Nov 2024 10:50:59 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/S3-to-Redshift-issues/m-p/2278006#M53600</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2024-11-16T10:50:59Z</dc:date>
    </item>
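For reference, the COPY statement that options 1 and 2 both issue against Redshift can be sketched outside Talend. This is a minimal sketch: the table, bucket path, and credential values below are hypothetical placeholders, not taken from the thread, and the delimiter note points at the most common cause of the "delimiter not found" error.

```python
def build_copy_sql(table, s3_path, access_key, secret_key, delimiter=","):
    """Assemble the Redshift COPY statement that loads files under s3_path."""
    creds = "aws_access_key_id={};aws_secret_access_key={}".format(
        access_key, secret_key)
    # "delimiter not found" during COPY usually means the DELIMITER value
    # here does not match the separator actually used in the S3 files.
    return (
        "COPY {} FROM '{}' "
        "CREDENTIALS '{}' "
        "DELIMITER '{}' "
        "IGNOREHEADER 1"
    ).format(table, s3_path, creds, delimiter)

sql = build_copy_sql("public.orders", "s3://my-bucket/orders/",
                     "AKIAEXAMPLE", "example-secret")
print(sql)

# Executing it (e.g. what tRedshiftRow does) would look like this,
# given psycopg2 and a live cluster:
# import psycopg2
# conn = psycopg2.connect(host="cluster.example.redshift.amazonaws.com",
#                         port=5439, dbname="dev", user="u", password="p")
# with conn, conn.cursor() as cur:
#     cur.execute(sql)
```

This only builds the statement; the connection part is left commented out since it needs real cluster credentials.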
    <item>
      <title>Re: S3 to Redshift issues.</title>
      <link>https://community.qlik.com/t5/Talend-Studio/S3-to-Redshift-issues/m-p/2278007#M53601</link>
      <description>tRedshiftRow is the best way. While it is "not ideal when using an ETL tool", it is ideal for Redshift. Routing the files through Talend adds an unnecessary step, and you lose the parallelism of a direct S3 -&amp;gt; Redshift COPY.
&lt;BR /&gt;Also keep in mind that a Redshift update is a delete / insert, and those deleted rows stick around until you vacuum the table. Consider adding a tRedshiftRow for the vacuum to avoid performance degradation.
&lt;BR /&gt;A complete process may look like:
&lt;BR /&gt;s3 -&amp;gt; staging -&amp;gt; update -&amp;gt; vacuum
&lt;BR /&gt;If you haven't already, I suggest testing whether a truncate / insert process is faster than an update, as the Redshift architecture is optimized for bulk processing.</description>
      <pubDate>Wed, 27 Jan 2016 11:38:28 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/S3-to-Redshift-issues/m-p/2278007#M53601</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2016-01-27T11:38:28Z</dc:date>
    </item>
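The s3 -&gt; staging -&gt; update -&gt; vacuum process from the reply can be sketched as an ordered list of SQL statements (each one runnable from a tRedshiftRow). The table names, key column, S3 path, and IAM role ARN are hypothetical placeholders; the update step is the delete / insert pattern the reply describes.

```python
def upsert_statements(target, staging, key, copy_from, creds):
    """Return the ordered SQL for a COPY-into-staging upsert on Redshift."""
    return [
        # 1. Stage: load the new files into an empty copy of the target.
        "CREATE TEMP TABLE {} (LIKE {});".format(staging, target),
        "COPY {} FROM '{}' CREDENTIALS '{}' DELIMITER ',';".format(
            staging, copy_from, creds),
        # 2. Update: a Redshift update is a delete of the matching rows
        #    followed by an insert of the fresh copies.
        "DELETE FROM {} USING {} WHERE {}.{} = {}.{};".format(
            target, staging, target, key, staging, key),
        "INSERT INTO {} SELECT * FROM {};".format(target, staging),
    ]

stmts = upsert_statements(
    "public.orders", "orders_stage", "order_id",
    "s3://my-bucket/orders/",
    "aws_iam_role=arn:aws:iam::123456789012:role/redshift-copy")
for s in stmts:
    print(s)

# 3. Vacuum: reclaims the space left by the DELETE. It cannot run inside
#    a transaction block, so issue it separately after committing:
print("VACUUM public.orders;")
```

Whether this beats a truncate / insert reload is worth benchmarking, as the reply suggests, since Redshift favors bulk writes over row-level changes.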
    <item>
      <title>Re: S3 to Redshift issues.</title>
      <link>https://community.qlik.com/t5/Talend-Studio/S3-to-Redshift-issues/m-p/2278008#M53602</link>
      <description>&lt;BLOCKQUOTE&gt;
 &lt;TABLE border="1"&gt;
  &lt;TBODY&gt;
   &lt;TR&gt;
    &lt;TD&gt;I am trying to move data from several files in an S3 bucket into Redshift tables. I have tried several options and wanted to know the best approach.&lt;BR /&gt;Note that I can copy the files from S3 into Redshift tables without Talend and it works great. However, I wanted to use Talend because I will be using this for incremental loads (without dropping/clearing the tables).&lt;BR /&gt;I have tried the following options.&lt;BR /&gt;1. tS3List --&amp;gt; tRedshiftBulkExec: fails with "delimiter not found". I am still trying to identify why this occurs; the COPY itself works.&lt;BR /&gt;- tRedshiftBulkExec requires me to re-enter my S3 credentials. This is frustrating, as I already have an S3 connection, and it would be great if the component could just reuse an existing connection.&lt;BR /&gt;2. tRedshiftRow: I enter the COPY command here and it works. However, this is not ideal when using an ETL tool; I could just as well use basic scripting. Ideally #1 should work.&lt;BR /&gt;3. S3 --&amp;gt; tS3Get (writes to local disk, ugh!!) --&amp;gt; tFileInput (from local) --&amp;gt; tRedshiftOutput (defeats the purpose of S3 to Redshift).&lt;BR /&gt;What is the best way to accomplish this using Talend OS for Big Data ver 6.1.1?&lt;/TD&gt;
   &lt;/TR&gt;
  &lt;/TBODY&gt;
 &lt;/TABLE&gt;
&lt;/BLOCKQUOTE&gt;</description>
      <pubDate>Fri, 17 Feb 2017 05:12:10 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/S3-to-Redshift-issues/m-p/2278008#M53602</guid>
      <dc:creator>_AnonymousUser</dc:creator>
      <dc:date>2017-02-17T05:12:10Z</dc:date>
    </item>
  </channel>
</rss>

