Hi,
I have a workflow that pushes data from a CSV file in S3 into a Redshift table, using the tRedshiftRow component to load the data.
In the Advanced settings of tRedshiftRow, I have set the commit interval to every 10,000 records.
In one scenario I was loading 1 million records into Redshift, but the job failed because of some bad data, after copying about 0.9 million records.
However, in Redshift I do not see any data in the table.
If the commit were working, I should have around 0.9 million records in the table, but they are not there.
So does commit work in this scenario?
Hi,
Could anyone help me? I am loading a 100 GB file from my local system to S3 and then into Redshift. My job flow is tS3Connection ----> tS3Put ----> tRedshiftRow, but I am getting the error below. I understand the error, and I have seen suggested solutions in a few blogs:
Your proposed upload exceeds the maximum allowed size.
My ideas for resolving this are:
1. Change some configuration detail to extend the size limit.
2. Multipart upload in Talend (a rough sketch of this option follows below).
3. A CLI component for the file-to-S3 step, and the S3 put component for S3 to Redshift.
Am I thinking along the right lines? Could anyone give a detailed approach?
Thank you, Jabi Shaik
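For option 2, here is a minimal sketch using the AWS SDK for Java TransferManager, which switches to multipart upload automatically for large files (a single S3 PUT is limited to 5 GB, while a multipart upload can go up to 5 TB). The bucket name, key, and file path below are placeholders, not values from the job above:

import java.io.File;

import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;
import com.amazonaws.services.s3.transfer.TransferManager;
import com.amazonaws.services.s3.transfer.TransferManagerBuilder;
import com.amazonaws.services.s3.transfer.Upload;

public class MultipartUploadSketch {
    public static void main(String[] args) throws Exception {
        // Placeholder bucket/key/file -- replace with your own values.
        String bucket = "my-bucket";
        String key = "staging/bigfile.csv";
        File file = new File("/data/bigfile.csv");

        AmazonS3 s3 = AmazonS3ClientBuilder.defaultClient();
        // TransferManager splits large files into parts automatically (multipart upload),
        // which avoids the single-PUT size limit that causes the error above.
        TransferManager tm = TransferManagerBuilder.standard().withS3Client(s3).build();
        Upload upload = tm.upload(bucket, key, file);
        upload.waitForCompletion();   // blocks until all parts have been uploaded
        tm.shutdownNow();
    }
}

The same idea applies if you script the upload outside Talend: the AWS CLI's "aws s3 cp" also performs multipart uploads automatically for large files.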
Hi, Is there any error message printed on the console during processing? Could you please upload your job design screenshots to the forum? Best regards, Sabrina
Hi,
Are any processed rows showing on your job flow? Could you please upload your job design screenshots to the forum so that we can understand your current situation more precisely.
Best regards
Sabrina
Hi,
I'm afraid I no longer have that job flow.
It was like this:
s3Put -> tRedshiftRow
On the connector, it showed the number of rows processed in a certain time and the rows/sec rate.
When it threw an error (because of the bad data), it showed around 0.9 million rows processed.
So I expected that, since it failed after 0.9 million rows, at least those rows would be present in Redshift, as I had set commit every 10,000 rows. But when I checked Redshift, to my surprise, the table was empty.
When I cleaned the data and the job processed the entire 1 million records again, I checked Redshift and it was all there.
So I guess it only commits to the database after the entire process finishes, not after every 10,000 records.
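That would match how Redshift behaves if the tRedshiftRow query is a single COPY statement: COPY is atomic, so it either loads everything or nothing, and a "commit every 10,000 rows" setting has nothing to checkpoint in between. A minimal JDBC sketch of what such a query amounts to (the endpoint, credentials, table, bucket, and IAM role are placeholders; the Redshift JDBC driver must be on the classpath):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class RedshiftCopySketch {
    public static void main(String[] args) throws Exception {
        // Placeholder endpoint and credentials -- replace with your own cluster details.
        String url = "jdbc:redshift://my-cluster.xxxxxxxx.us-east-1.redshift.amazonaws.com:5439/dev";
        try (Connection conn = DriverManager.getConnection(url, "myuser", "mypassword");
             Statement stmt = conn.createStatement()) {
            conn.setAutoCommit(false);
            // COPY is one SQL statement: Redshift loads the whole file inside it.
            // If it aborts on bad data, none of the rows from this COPY remain,
            // no matter how often the client intended to commit.
            stmt.execute(
                "COPY my_table FROM 's3://my-bucket/data.csv' "
                + "IAM_ROLE 'arn:aws:iam::123456789012:role/MyRedshiftRole' "
                + "CSV");
            conn.commit();   // only reached if the COPY itself succeeded
        }
    }
}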
Thanks,
Neil Shah
Hey, have you checked the stl_load_errors table in Redshift? It might be that some column value is going into a different column while copying from the S3 bucket to Redshift. Thanks, Ashish
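For reference, stl_load_errors can be queried directly over JDBC; a minimal sketch (the endpoint and credentials are placeholders):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class LoadErrorCheck {
    public static void main(String[] args) throws Exception {
        // Placeholder endpoint and credentials -- replace with your own cluster details.
        String url = "jdbc:redshift://my-cluster.xxxxxxxx.us-east-1.redshift.amazonaws.com:5439/dev";
        try (Connection conn = DriverManager.getConnection(url, "myuser", "mypassword");
             Statement stmt = conn.createStatement();
             // Most recent load errors first; colname and err_reason show what went wrong.
             ResultSet rs = stmt.executeQuery(
                 "SELECT starttime, filename, colname, err_reason "
                 + "FROM stl_load_errors ORDER BY starttime DESC LIMIT 10")) {
            while (rs.next()) {
                System.out.println(rs.getString("filename") + " | "
                        + rs.getString("colname") + " | "
                        + rs.getString("err_reason"));
            }
        }
    }
}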
As I mentioned earlier, the problem is not with the error that was thrown; I have resolved that. The problem is with the "commit" of the data, so there is no point in checking the stl_load_errors table.