Anonymous

Transforming Data from S3 on the Fly

Hi Experts,

Right now I have a case where I need to transform data coming from S3 on the fly, as part of the ETL process.

Could you please suggest a solution, or an idea of which components I should use?
Currently, the only way I can think of is to download the data locally first, use the local flat file as the source, apply the transformation to it, and put the cleaned data back into S3 as a single file.
In the end, this data will be loaded into AWS Redshift.
(I think the samples provided with Talend use tRedshiftBulkExec to load an entire CSV file from S3 into a table. In my case, I need to do some transformation first before pumping the data into Redshift.)

Thanks in advance

1 Reply
Anonymous
Author

S3 is not a local file system; rather, it is accessible via REST, SOAP or BitTorrent (see https://en.wikipedia.org/wiki/Amazon_S3). Thus, no matter which approach you use to work with S3, you will either explicitly or implicitly have to copy the file locally, process it, and then upload it again. If the files are small enough, you can deal with them in memory.
So, for the above: use tS3Get, then (assuming CSV data) tFileInputDelimited, add your processing components, and finally use tRedshiftBulkExec with the prepared file.
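For reference, the same download-transform-upload-COPY pattern can be sketched outside Talend. The Python snippet below is only a minimal illustration of the idea, assuming boto3 and psycopg2 are available; the bucket names, keys, table name, cluster endpoint, and IAM role ARN are placeholders, not values from this thread.

```python
# Sketch of the pattern: download from S3, transform locally,
# upload the cleaned file, then COPY it into Redshift.
# All bucket/key/table/credential values are placeholders.
import csv

import boto3
import psycopg2

s3 = boto3.client("s3")

# 1. Copy the raw file locally (what tS3Get does in the Talend job).
s3.download_file("my-bucket", "raw/input.csv", "/tmp/input.csv")

# 2. Transform the flat file row by row
#    (tFileInputDelimited plus your processing components).
with open("/tmp/input.csv", newline="") as src, \
        open("/tmp/clean.csv", "w", newline="") as dst:
    reader = csv.reader(src)
    writer = csv.writer(dst)
    for row in reader:
        # Example transformation: trim whitespace and drop empty rows.
        cleaned = [field.strip() for field in row]
        if any(cleaned):
            writer.writerow(cleaned)

# 3. Put the cleaned file back on S3 so Redshift can bulk-load it.
s3.upload_file("/tmp/clean.csv", "my-bucket", "clean/output.csv")

# 4. Bulk-load into Redshift with COPY (what tRedshiftBulkExec issues).
conn = psycopg2.connect(
    host="my-cluster.abc123.us-east-1.redshift.amazonaws.com",
    port=5439, dbname="mydb", user="myuser", password="mypassword",
)
with conn, conn.cursor() as cur:
    cur.execute("""
        COPY public.my_table
        FROM 's3://my-bucket/clean/output.csv'
        IAM_ROLE 'arn:aws:iam::123456789012:role/MyRedshiftCopyRole'
        CSV
    """)
conn.close()
```

If the files are small, step 2 could be done entirely in memory instead of writing a temporary local file, but the overall flow stays the same.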

Thomas