gopal16
Contributor

Retrieve selected files from an S3 bucket and process them directly in the job

My scenario: every day, source files arrive in S3 folders whose names contain dynamic dates. I need to pick up only the files that arrived after the last processed timestamp and use them in the main flow of the job. I am using the tS3List component to list the files by prefix (I can't give the complete path because the folder names contain dynamic dates). After that, in the tS3Get component, I need to get only the files that are newer than the last processed timestamp, but there are not many options available: I can only provide the tS3List current key in the Key field. With this I am also getting older files that were already processed. Also, once I get the right file, I don't want to store it locally; I want to process it directly in the job. Please help me achieve this. Thanks!

19 Replies
Anonymous
Not applicable

Hi
I have seen similar requirements raised by other users. Unfortunately, it is not possible to read a file directly on S3: you have to download the file to the local system, process it, and then delete the local copy afterwards if needed.

Regards
Shong
gopal16
Contributor
Author

Okay. What about picking up files with a particular prefix from the dynamic date folders in S3 instead?

Anonymous
Not applicable

Hi
Set the key prefix to the parent folder that contains the dynamic date folders. The component will then list all the sub-folders and files, and you can filter the files on a condition such as the file extension, for example "*.txt".
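The filtering step can be sketched in plain Java, outside of any Talend component. This is only an illustration: the `ListedObject` and `NewFileFilter` classes, the `incoming/` prefix in the usage note, and the idea that each listed object's last-modified time is available are all assumptions, not something confirmed in this thread.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for one listed S3 object: its key and last-modified time.
final class ListedObject {
    final String key;
    final Instant lastModified;

    ListedObject(String key, Instant lastModified) {
        this.key = key;
        this.lastModified = lastModified;
    }
}

// Hypothetical helper, not part of Talend or the AWS SDK.
final class NewFileFilter {
    // Keeps keys that start with the prefix, end with the suffix,
    // and were modified strictly after the last processed timestamp.
    static List<String> selectNewKeys(List<ListedObject> listed, String prefix,
                                      String suffix, Instant lastProcessed) {
        List<String> selected = new ArrayList<>();
        for (ListedObject obj : listed) {
            boolean matchesName = obj.key.startsWith(prefix) && obj.key.endsWith(suffix);
            boolean isNewer = obj.lastModified.isAfter(lastProcessed);
            if (matchesName && isNewer) {
                selected.add(obj.key);
            }
        }
        return selected;
    }
}
```

For example, calling `NewFileFilter.selectNewKeys(listed, "incoming/", ".txt", lastProcessed)` would keep only the `.txt` objects under the parent folder that are newer than the stored timestamp, whichever date sub-folder they sit in.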

Regards
Shong
gopal16
Contributor
Author

With the flow below, I am able to list the files and select particular files using the filter component. But it only lists them; I am not able to process the files in the job.

tS3List -> tIterateToFlow -> tFilterRow -> tLogRow

I am not able to link tFilterRow to tS3Get with a row connection. If I link them using OnComponentOk from tFilterRow, it picks up only the last file instead of all the required files. If I link tS3Get directly from tS3List, I get all the files instead of only the required ones.

Please help.

manodwhb
Champion II

@gopal16, after tLogRow, add a tJavaRow that stores the file name in a context variable, then connect tJavaRow to tS3Get with an OnComponentOk trigger, and you will get the required files.

gopal16
Contributor
Author

I am not able to link tJavaRow to tS3Get with a Main connection. Only OnComponentOk is allowed, and with that I get only the last file, not all the required files.
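A likely explanation for the single-file result, sketched in plain Java rather than in Talend itself: an OnComponentOk trigger fires once, after the component has processed every row, so a variable that is overwritten per row holds only the last value by then. The `TriggerTimingSketch` class and its `fetch` method are hypothetical stand-ins for the job and for tS3Get.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical simulation of trigger timing; not Talend-generated code.
final class TriggerTimingSketch {
    // Stand-in for tS3Get: records which key would be downloaded.
    static void fetch(String key, List<String> fetched) {
        fetched.add(key);
    }

    // Per-row step overwrites one variable; the fetch runs once after the loop,
    // which mirrors a trigger that fires only when the component has finished.
    static List<String> fetchOnceAfterAllRows(List<String> filteredKeys) {
        List<String> fetched = new ArrayList<>();
        String currentKey = null;
        for (String key : filteredKeys) {
            currentKey = key; // each row replaces the previous value
        }
        if (currentKey != null) {
            fetch(currentKey, fetched); // only the last key is still held
        }
        return fetched;
    }

    // The fetch runs inside the loop, once per filtered row.
    static List<String> fetchPerRow(List<String> filteredKeys) {
        List<String> fetched = new ArrayList<>();
        for (String key : filteredKeys) {
            fetch(key, fetched);
        }
        return fetched;
    }
}
```

The first method reproduces the behaviour reported here; the second shows what a per-row hand-off to the download step would look like. Whether an Iterate-style link between the filter and tS3Get achieves that in this job is an assumption that this thread does not confirm.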

manodwhb
Champion II

@gopal16, yes, you need to use OnComponentOk to connect. How many files was tLogRow receiving?

gopal16
Contributor
Author

tLogRow is receiving 10+ records with file names.

manodwhb
Champion II

@gopal16, if that is the case, you will get all the files with the design I described. Please verify.