Hi All,
I have a requirement that needs to be developed on the Talend Real-Time Big Data Platform (that is the name of our product edition).
So, I'll be sharing the scenario with you. Please assist me.
Scenario: I'll be getting multiple file feeds from different sources. Some feeds contain the header, body (detail) and trailer data in a single file, whereas other feeds arrive as 2 separate files: one with the header and trailer data and the other with the body (detail) data.
Every file feed has a different schema (i.e. different column names).
For example, 5 files have 5 different sets of column names.
So I have to look up the column names (schema) that correspond to each file.
Whenever a file such as XYZ arrives, the job has to look up its schema, fetch that file, and then run the ETL process.
Example
file name XYZ.txt
ID|Name| -- Header
1 | abc | -- Body
2 | efg | -- Body
total count 2 -- Trailer
Stored schema with datatype
ID
Name
Note: Storing the schema should be a one-time activity.
So, each file feed has to be matched against its stored schema, and the job should fetch only the "XYZ.txt" file, not any other file.
I hope you have understood my requirement.
If so, please suggest an approach or guide me on this requirement.
Regards,
Rohitash Sherigar
(M: +91-9594733034)
I assume you want to fetch the files from a remote location using FTP/SFTP/SMB/HTTP. In that case you cannot fetch a file based on its data.
One option is to start reading each file in stream mode and stop the stream if the header does not match the requirement.
https://help.talend.com/reader/KxVIhxtXBBFymmkkWJ~O4Q/Vr3wYVdxbfcjeba~qekWDw
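To illustrate the idea of stopping early, here is a minimal plain-Java sketch, not a Talend component. It reads only the first line of a source and compares it with the expected column names. The class name HeaderCheck, the method name headerMatches and the pipe delimiter are assumptions for illustration (the delimiter is taken from the XYZ.txt example above).

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper, not part of Talend.
class HeaderCheck {

    // Reads only the first line of the source and compares its column names
    // with the expected ones. The remaining lines are never parsed.
    static boolean headerMatches(Reader source, List<String> expectedColumns) {
        try (BufferedReader reader = new BufferedReader(source)) {
            String firstLine = reader.readLine();
            if (firstLine == null) {
                return false; // empty file, so there is no header to compare
            }
            // Assumption: columns are pipe-delimited, as in the XYZ.txt example.
            List<String> actual = Arrays.stream(firstLine.split("\\|"))
                    .map(String::trim)
                    .filter(name -> !name.isEmpty())
                    .collect(Collectors.toList());
            return actual.equals(expectedColumns);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

For a real file you would pass something like a FileReader for the candidate file and skip the file when the method returns false.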
Let me know if my understanding is not correct.
No, it's not like that. I think I used the wrong terminology in my explanation.
The files will already be in a remote or local directory.
As I mentioned in my previous post, one directory may contain hundreds of files, and each file has different attributes.
Ex:
File 1      File 2            File 3              File n
ID, Name    Dept, Dept_Name   CustID, Cust_Name   ... so on
1, ABC      2, EFG            1, XYZ
Now, I'm using only 1 tFileList and 1 tFileInputDelimited component; the directory stays the same but the files differ.
The file name changes through a global variable expression, and the schema in the Edit schema section should change at the same time.
If File 1, then the schema should be ID, Name
If File 2, then the schema should be Dept, Dept_Name
If File 3, then the schema should be CustID, Cust_Name
... and so on.
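To make the mapping concrete, this is roughly the lookup I have in mind, written as a plain-Java sketch rather than an actual Talend job. The class SchemaRegistry and the file names File1.txt, File2.txt and File3.txt are hypothetical; the column names are the ones from the example above. The schemas are registered once and then looked up by the current file name.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical lookup table: file name -> ordered column names.
class SchemaRegistry {

    private final Map<String, List<String>> schemas = new LinkedHashMap<>();

    // One-time registration of the schema for a given file name.
    void register(String fileName, String... columns) {
        schemas.put(fileName, Collections.unmodifiableList(Arrays.asList(columns)));
    }

    // Returns the stored columns, or fails loudly when no schema is stored.
    List<String> columnsFor(String fileName) {
        List<String> columns = schemas.get(fileName);
        if (columns == null) {
            throw new IllegalArgumentException("No schema stored for " + fileName);
        }
        return columns;
    }

    // Sample data; the file names here are assumptions for illustration.
    static SchemaRegistry example() {
        SchemaRegistry registry = new SchemaRegistry();
        registry.register("File1.txt", "ID", "Name");
        registry.register("File2.txt", "Dept", "Dept_Name");
        registry.register("File3.txt", "CustID", "Cust_Name");
        return registry;
    }
}
```

The key passed to columnsFor would be whatever file name the file list iteration is currently on.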
After that, the split process will start, in which the header and trailer records get tagged.
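As a rough sketch of the tagging step in plain Java: the rule used here is an assumption based on the XYZ.txt example, where the first line is the header and a line starting with "total count" is the trailer. The class name RecordTagger and the tag labels are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical tagger for files that carry header, body and trailer together.
class RecordTagger {

    // Prefixes every line with HEADER, BODY or TRAILER.
    // Assumed rule: line 1 is the header, and any later line that starts
    // with "total count" is the trailer. Everything else is body.
    static List<String> tag(List<String> lines) {
        List<String> tagged = new ArrayList<>();
        for (int i = 0; i < lines.size(); i++) {
            String line = lines.get(i).trim();
            String tag;
            if (i == 0) {
                tag = "HEADER";
            } else if (line.toLowerCase().startsWith("total count")) {
                tag = "TRAILER";
            } else {
                tag = "BODY";
            }
            tagged.add(tag + ": " + line);
        }
        return tagged;
    }
}
```

Feeds that deliver the header and trailer in a separate file would need a different rule, since the position of a line no longer tells you what it is.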
I hope you have understood my requirement.
If it's still not clear, please share your contact details.
We can then discuss the requirement with each other.
Regards,
Rohitash Sherigar