Hi,
I have a workflow that pushes data from a CSV file in S3 into a Redshift table, using the tRedshiftRow component to load the data.
In the Advanced settings of tRedshiftRow, I have set the commit interval to every 10,000 records.
In one scenario I was loading 1 million records into Redshift, but the job failed because of some bad data, after copying about 0.9 million records.
However, in Redshift I do not see any data in the table.
If the commit were working, I should have around 0.9 million records in the table, but they are not there.
So does commit work in this scenario?
Hi,
Could anyone help me? I am loading a 100 GB file from my local system to S3 and then into Redshift. My job flow is tS3Connection ----> tS3Put ----> tRedshiftRow, but I am getting the error below. I understand the error, and I have seen suggested solutions in a few blogs:
Your proposed upload exceeds the maximum allowed size.
My ideas for resolving this are:
1. Change some configuration detail to extend the size limit.
2. Multipart upload in Talend (a rough sketch of this option follows below).
3. A CLI component for the file-to-S3 step, and the S3 put component for S3 to Redshift.
Am I thinking along the right lines? Could anyone give a detailed approach?
Thank you, Jabi Shaik
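For option 2, here is a minimal sketch using the AWS SDK for Java TransferManager, which switches to multipart upload automatically for large files (a single S3 PUT is limited to 5 GB, while a multipart upload can go up to 5 TB). The bucket name, key, and file path below are placeholders, not values from the job above:

import java.io.File;

import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;
import com.amazonaws.services.s3.transfer.TransferManager;
import com.amazonaws.services.s3.transfer.TransferManagerBuilder;
import com.amazonaws.services.s3.transfer.Upload;

public class MultipartUploadSketch {
    public static void main(String[] args) throws Exception {
        // Placeholder bucket/key/file -- replace with your own values.
        String bucket = "my-bucket";
        String key = "staging/bigfile.csv";
        File file = new File("/data/bigfile.csv");

        AmazonS3 s3 = AmazonS3ClientBuilder.defaultClient();
        // TransferManager splits large files into parts automatically (multipart upload),
        // which avoids the single-PUT size limit that causes the error above.
        TransferManager tm = TransferManagerBuilder.standard().withS3Client(s3).build();
        Upload upload = tm.upload(bucket, key, file);
        upload.waitForCompletion();   // blocks until all parts have been uploaded
        tm.shutdownNow();
    }
}

The same idea applies if you script the upload outside Talend: the AWS CLI's "aws s3 cp" also performs multipart uploads automatically for large files.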
Hi, Is there any error message printed on the console during processing? Could you please upload your job design screenshots to the forum? Best regards, Sabrina
Hi,
Are any processed rows showing on your job flow? Could you please upload your job design screenshots to the forum so that we can understand your current situation more precisely.
Best regards
Sabrina
Hi,
I'm afraid I no longer have that job flow.
It was like this:
s3Put -> tRedshiftRow
On the connector, it showed the number of rows processed in a certain time and the rows/sec rate.
When it threw an error (because of the bad data), it showed around 0.9 million rows processed.
So I expected that, since it failed after 0.9 million rows, at least those rows would be present in Redshift, as I had set commit every 10,000 rows. But when I checked Redshift, to my surprise, the table was empty.
When I cleaned the data and the job processed the entire 1 million records again, I checked Redshift and it was all there.
So I guess it only commits to the database after the entire process finishes, not after every 10,000 records.
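That would match how Redshift behaves if the tRedshiftRow query is a single COPY statement: COPY is atomic, so it either loads everything or nothing, and a "commit every 10,000 rows" setting has nothing to checkpoint in between. A minimal JDBC sketch of what such a query amounts to (the endpoint, credentials, table, bucket, and IAM role are placeholders; the Redshift JDBC driver must be on the classpath):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class RedshiftCopySketch {
    public static void main(String[] args) throws Exception {
        // Placeholder endpoint and credentials -- replace with your own cluster details.
        String url = "jdbc:redshift://my-cluster.xxxxxxxx.us-east-1.redshift.amazonaws.com:5439/dev";
        try (Connection conn = DriverManager.getConnection(url, "myuser", "mypassword");
             Statement stmt = conn.createStatement()) {
            conn.setAutoCommit(false);
            // COPY is one SQL statement: Redshift loads the whole file inside it.
            // If it aborts on bad data, none of the rows from this COPY remain,
            // no matter how often the client intended to commit.
            stmt.execute(
                "COPY my_table FROM 's3://my-bucket/data.csv' "
                + "IAM_ROLE 'arn:aws:iam::123456789012:role/MyRedshiftRole' "
                + "CSV");
            conn.commit();   // only reached if the COPY itself succeeded
        }
    }
}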
Thanks,
Neil Shah
Hey, have you checked the stl_load_errors table in Redshift? It might be that some column value is going into a different column while copying from the S3 bucket to Redshift. Thanks, Ashish
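For reference, stl_load_errors can be queried directly over JDBC; a minimal sketch (the endpoint and credentials are placeholders):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class LoadErrorCheck {
    public static void main(String[] args) throws Exception {
        // Placeholder endpoint and credentials -- replace with your own cluster details.
        String url = "jdbc:redshift://my-cluster.xxxxxxxx.us-east-1.redshift.amazonaws.com:5439/dev";
        try (Connection conn = DriverManager.getConnection(url, "myuser", "mypassword");
             Statement stmt = conn.createStatement();
             // Most recent load errors first; colname and err_reason show what went wrong.
             ResultSet rs = stmt.executeQuery(
                 "SELECT starttime, filename, colname, err_reason "
                 + "FROM stl_load_errors ORDER BY starttime DESC LIMIT 10")) {
            while (rs.next()) {
                System.out.println(rs.getString("filename") + " | "
                        + rs.getString("colname") + " | "
                        + rs.getString("err_reason"));
            }
        }
    }
}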
As I mentioned earlier, the problem is not with the error that was thrown; I have resolved that. The problem is with the "commit" of the data, so there is no point in checking the stl_load_errors table.