Rohitash
Contributor

Dynamic file fetch according to its schema

Hi All,

 

I have a requirement that needs to be developed on Talend Real-Time Big Data Platform (that is our product version name).

 

I'll describe the scenario below. Please assist me.

 

Scenario: I'll be getting multiple file feeds from different sources. Some feeds will arrive as a single file containing header, body (detail) and trailer data, whereas other feeds will arrive as two separate files: one with the header and trailer data, and the other with the body (detail) data.

Every file feed has a different schema (i.e. different column names).

For example, 5 files have 5 different sets of column names.

 

So I have to look up the column names (schema) that correspond to each file.

 

Suppose an XYZ file arrives: the job has to look up its schema, then fetch that file, and then the ETL process will run.

 

Example 

 

file name XYZ.txt                   

 

ID|Name|     -- Header     

1 | abc |       -- Body

2 | efg |        -- Body

total count 2 -- Trailer

 

Stored schema with datatype

 

ID

Name

 

Note: Storing the schema should be a one-time activity.

 

So each file feed has to be looked up against the stored schema, and only the "XYZ.txt" file should be fetched, not any other file.

 

I hope you have understood my requirement.

If so, please suggest or guide me on this requirement.

 

Regards,

Rohitash Sherigar

(M: +91-9594733034)

3 Replies
akumar2301
Specialist II

I assume you want to fetch the files from a remote location using FTP/SFTP/SMB/HTTP. In that case you cannot fetch a file based on its data.

 

One option is to start reading each file in stream mode and stop the stream if the header does not match the requirement.

 

https://help.talend.com/reader/KxVIhxtXBBFymmkkWJ~O4Q/Vr3wYVdxbfcjeba~qekWDw
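
The header check described above can be sketched in plain Java. This is only an illustration, not a Talend component or the linked documentation's method: the class name HeaderCheck, the method headerMatches and the pipe delimiter are assumptions made for the example. It parses only the first line and closes the stream without processing the remaining rows.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper, not part of Talend.
public class HeaderCheck {

    // Parses only the first line of the stream and compares its column
    // names with the expected ones; the stream is closed right after.
    public static boolean headerMatches(Reader source, String delimiterRegex,
                                        List<String> expected) {
        try (BufferedReader reader = new BufferedReader(source)) {
            String first = reader.readLine();
            if (first == null) {
                return false; // empty input: nothing to match
            }
            List<String> actual = Arrays.stream(first.split(delimiterRegex))
                    .map(String::trim)
                    .filter(s -> !s.isEmpty())
                    .collect(Collectors.toList());
            return actual.equals(expected);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Sample content taken from the XYZ.txt example in the question.
        String xyz = "ID|Name|\n1 | abc |\n2 | efg |\ntotal count 2\n";
        boolean ok = headerMatches(new StringReader(xyz), "\\|",
                Arrays.asList("ID", "Name"));
        System.out.println("XYZ.txt header matches: " + ok);
    }
}
```

For a real file, the StringReader would be replaced by a reader over the file or the remote stream; how that stream is opened depends on the protocol and is left out here.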

 

Let me know if my understanding is not correct.

Rohitash
Contributor
Author

No, it's not like that. I think I used the wrong terminology to explain.

 

The files will already be present in a remote or local directory.

As I mentioned in my previous post, one directory may contain hundreds of files, and each file has different attributes.

 

Ex:

File 1: ID, Name (sample row: 1, ABC)

File 2: Dept, Dept_Name (sample row: 2, EFG)

File 3: CustID, Cust_Name (sample row: 1, XYZ)

File n: ... and so on

 

Now, I'm using only one tFileList and one tFileInputDelimited component; the directory stays the same but the files are different.

The file name changes through a global expression; at the same time, the schema in the Edit schema section should change as well.

 

Like if  File 1 then schema should be ID, Name

           File 2 then schema should be Dept, Dept_Name

           File 3 then schema should be CustID,Cust_Name

           ... so on.

 

After that, the split process will start, where the header and trailer records get tagged.
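
A minimal sketch of that tagging step in plain Java, assuming the first line of a file is always the header and the last line is always the trailer (as in the XYZ.txt example). The class name RecordTagger and the tag labels are hypothetical and only used for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Talend.
public class RecordTagger {

    // Prefixes each line with HEADER, BODY or TRAILER based on its position.
    // Assumption: first line = header, last line = trailer.
    public static List<String> tag(List<String> lines) {
        List<String> tagged = new ArrayList<>();
        int last = lines.size() - 1;
        for (int i = 0; i <= last; i++) {
            String type = (i == 0) ? "HEADER" : (i == last) ? "TRAILER" : "BODY";
            tagged.add(type + ": " + lines.get(i));
        }
        return tagged;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "ID|Name|", "1 | abc |", "2 | efg |", "total count 2");
        tag(lines).forEach(System.out::println);
    }
}
```

For feeds where the header and trailer arrive in a separate file, the position rule would not apply and the tag would have to come from the file itself.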

 

I hope you have understood my requirement.

 

If it's still not clear, please share your contact details.

We can then discuss the requirement directly.

 

Regards,

Rohitash Sherigar

akumar2301
Specialist II

You cannot change the schema dynamically in a component, but you can use the dynamic schema type. Assuming you do not have a lot of transformations, this would be a good option.
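
To illustrate the idea behind a dynamic schema, here is a plain Java sketch. It does not use Talend's Dynamic type or any Talend API; the class name DynamicRows and its parse method are assumptions made for the example. The column names are taken from the first line at run time, so the same code handles files with different columns.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Talend.
public class DynamicRows {

    // Uses the first line as the column list and maps every following line
    // to column name -> value, so no schema is fixed at design time.
    public static List<Map<String, String>> parse(List<String> lines,
                                                  String delimiterRegex) {
        List<Map<String, String>> rows = new ArrayList<>();
        if (lines.isEmpty()) {
            return rows;
        }
        String[] columns = lines.get(0).split(delimiterRegex);
        for (String line : lines.subList(1, lines.size())) {
            // limit -1 keeps trailing empty values
            String[] values = line.split(delimiterRegex, -1);
            Map<String, String> row = new LinkedHashMap<>();
            for (int i = 0; i < columns.length; i++) {
                // Missing trailing values are stored as null.
                row.put(columns[i].trim(),
                        i < values.length ? values[i].trim() : null);
            }
            rows.add(row);
        }
        return rows;
    }

    public static void main(String[] args) {
        // Same code, two different layouts from the question.
        System.out.println(parse(Arrays.asList("ID,Name", "1,ABC"), ","));
        System.out.println(parse(Arrays.asList("Dept,Dept_Name", "2,EFG"), ","));
    }
}
```

The trade-off is the one mentioned above: since the columns are only known at run time, per-column transformations are harder to express than with a fixed schema.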



Otherwise, if you have a separate subjob for each schema type, you can read the input header and identify which file should go to which subjob; that is also possible. Let us know.
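
The routing decision for the subjob approach could look roughly like this in plain Java. The class name SubjobRouter, the subjob names and the "reject" fallback are all hypothetical; the header layouts are the three from the example earlier in the thread. The registry is filled once, which fits the one-time schema storage mentioned in the question.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper, not part of Talend.
public class SubjobRouter {

    // Registry filled once: normalized header -> subjob name (names are made up).
    private static final Map<String, String> ROUTES = new LinkedHashMap<>();
    static {
        ROUTES.put("ID,NAME", "subjob_file1");
        ROUTES.put("DEPT,DEPT_NAME", "subjob_file2");
        ROUTES.put("CUSTID,CUST_NAME", "subjob_file3");
    }

    // Trims and upper-cases each column name so spacing and case do not matter.
    public static String normalize(String headerLine, String delimiterRegex) {
        return Arrays.stream(headerLine.split(delimiterRegex))
                .map(s -> s.trim().toUpperCase(Locale.ROOT))
                .filter(s -> !s.isEmpty())
                .collect(Collectors.joining(","));
    }

    // Returns the subjob for a header line, or "reject" for unknown layouts.
    public static String route(String headerLine, String delimiterRegex) {
        return ROUTES.getOrDefault(normalize(headerLine, delimiterRegex), "reject");
    }

    public static void main(String[] args) {
        System.out.println(route("ID , Name", ","));
        System.out.println(route("CustID,Cust_Name", ","));
        System.out.println(route("Foo,Bar", ","));
    }
}
```

In a job, the returned name could be stored in globalMap and tested in Run if triggers to start the matching subjob; that wiring is a suggestion and is not shown here.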