Hi

Hi Everyone,

 

I am pulling all the files from S3 and loading the data inside them into database tables.

I have another table in Oracle that holds the names of all the files I pull from S3 and load into the database (for auditing purposes).

Suppose someone uploads a new file to S3 that has no record in the database.

I need to check whether the name of each file I pull from S3 is already in the database or whether it is a new file.

Do you know how this can be achieved?

I tried a join condition in tMap with a tOracleInput (reading the names of all records from the DB),

but the links will not connect from my previous job, which pulls the files from S3.

Do you know any other way to do this?

 

Regards,

Mohit

 


Accepted Solutions

Hi Mohit,

 

     The logic below will help you identify whether a file from S3 is new or not. I have created only a skeleton flow, and you will need to expand it to fit your requirements.

0683p000009LzVD.png

 

 

In the flow above, we first fetch the list of file names from S3 and store it in a tHashOutput component. Once all the file names are stored, we read them back with a tHashInput (don't forget to select the "Clear cache after reading" option) and do an inner join with the Oracle table in a tMap. All the existing files will be present in the main flow of the tMap, and all the new files will go to the reject flow of the tMap (since they have no matching records in the Oracle DB).
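
For readers who want to see the matching rule outside the Talend designer, here is a minimal plain-Java sketch of the same idea: names that match the audit table go to one list, and names with no match (the inner join rejects) go to another. The class name `NewFileCheck`, the method `splitByAudit`, and the sample file names are hypothetical and exist only for illustration; in the real job the two inputs would come from the tHashInput and the Oracle lookup.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Plain-Java analogue of an inner join with a reject output (illustrative only).
final class NewFileCheck {

    // Splits the S3 file names into "existing" (found in the audit table)
    // and "new" (not found, i.e. what the inner join would reject).
    static Map<String, List<String>> splitByAudit(List<String> s3Names, Set<String> auditedNames) {
        List<String> existing = new ArrayList<>();
        List<String> fresh = new ArrayList<>();
        for (String name : s3Names) {
            if (auditedNames.contains(name)) {
                existing.add(name);   // matched: main output
            } else {
                fresh.add(name);      // no match: reject output
            }
        }
        Map<String, List<String>> result = new LinkedHashMap<>();
        result.put("existing", existing);
        result.put("new", fresh);
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical sample data standing in for the S3 listing and the Oracle audit table.
        List<String> s3Names = List.of("orders_01.csv", "orders_02.csv", "orders_03.csv");
        Set<String> auditedNames = Set.of("orders_01.csv", "orders_02.csv");
        System.out.println(splitByAudit(s3Names, auditedNames));
    }
}
```

The comparison here is an exact, case-sensitive string match, so the names on both sides need to be in the same form (for example, both with or both without the S3 key prefix) for the join to behave as expected.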

 

     The main trick is in the tRowGenerator, where we pick up each file name only once (by setting the number of rows to generate to 1). Screenshots of the main components are below.

0683p000009M1Fo.png (tRowGenerator)

 

0683p000009M1Ft.png (tMap)
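
Since the screenshots may not display, the tRowGenerator step can be approximated in plain Java as follows. This is only a sketch under assumptions: it supposes the S3 listing is done by a component that iterates over the keys and publishes the current one in `globalMap` under the name `tS3List_1_CURRENT_KEY` (check the actual variable name in your own job), and the class `OneRowPerKey` and method `collect` are hypothetical. The point it illustrates is that generating exactly one row per iteration yields each file name once.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Simulates "one generated row per iteration", buffered in memory (illustrative only).
final class OneRowPerKey {

    // Assumed global variable name; verify it against your own job.
    static final String CURRENT_KEY = "tS3List_1_CURRENT_KEY";

    static List<String> collect(List<String> s3Keys) {
        Map<String, Object> globalMap = new HashMap<>();
        List<String> buffer = new ArrayList<>();     // plays the role of the hash output
        for (String key : s3Keys) {
            globalMap.put(CURRENT_KEY, key);         // set once per iteration by the listing step
            int rowsToGenerate = 1;                  // number of rows to generate = 1
            for (int i = 0; i < rowsToGenerate; i++) {
                // Column expression: read the current key back from globalMap.
                buffer.add((String) globalMap.get(CURRENT_KEY));
            }
        }
        return buffer;
    }

    public static void main(String[] args) {
        // Hypothetical keys standing in for the S3 listing.
        System.out.println(collect(List.of("in/a.csv", "in/b.csv")));
    }
}
```

If the number of rows were set higher than 1, every file name would be buffered several times and the later join would produce duplicate rows.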

 

 

I hope my answer has helped to clear your query. Could you please mark the topic as resolved so that it will help the Talend community? Kudos are also welcome 🙂

 

Warm Regards,

 

Nikhil Thampi
