Hello!
I'd like to remove all files that have the same content, keeping only one copy of each.
The final result should be a set of files that all have different content.
My file names follow the format fileName_timestamp.csv
For example:
My directory looks like this:
- fileName_t1m3st4mp.csv
- fileName_0th3rt1m3st4mp.csv
- fileName_4n0th3rt1m3st4mp.csv
The content of my files looks like this:
fileName_t1m3st4mp.csv
This is a content
fileName_0th3rt1m3st4mp.csv
This is a content
fileName_4n0th3rt1m3st4mp.csv
This is a different content
When I run the job:
fileName_t1m3st4mp.csv should be deleted (it has the same content as fileName_0th3rt1m3st4mp.csv)
Now my directory should only have:
- fileName_0th3rt1m3st4mp.csv
- fileName_4n0th3rt1m3st4mp.csv
I'm using Talend ESB 7.
If you have any suggestions, please share them!
Thanks!
Try tFileList, tMemorizeRows, tFileCompare and tFileDelete.
I'm not sure whether these are part of ESB.
Thanks for your response!
Those components are indeed in ESB.
I need to compare each file with all the others, and I'm not sure how to do that with a tFileCompare component, since it only allows one input.
Can you guide me through your thinking?
Best regards,
Here you're mainly checking the file name, not the actual content.
I think I found something. I can log the content and the file name independently, but I can't find a way to get both of them at the same time.
My goal here is to get an output that contains every file name together with its content (fileName;fileContent).
I guess I'll be able to use a tUniqRow to detect duplicate content once I've figured this out.
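In case it helps anyone reading later, here is a rough plain-Java sketch of the same idea (pair each file name with its content, then keep only the first file for each distinct content). It is not a Talend job and not the only way to do it: the class and method names (DuplicateFileRemover, contentHash, removeDuplicates) are made up for illustration, it assumes all files sit in one directory and match *.csv, and it compares SHA-256 hashes of the content instead of the content itself. Files are processed in name order, so the copy that survives is simply the first one by name.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Talend: deletes files whose content
// duplicates that of a file already seen in the same directory.
public class DuplicateFileRemover {

    // Returns the hex-encoded SHA-256 digest of the file's bytes.
    static String contentHash(Path file) throws IOException, NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance("SHA-256");
        byte[] hash = digest.digest(Files.readAllBytes(file));
        StringBuilder hex = new StringBuilder();
        for (byte b : hash) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    // Keeps the first file (in name order) for each distinct content,
    // deletes the others, and returns the list of deleted files.
    static List<Path> removeDuplicates(Path dir) throws IOException, NoSuchAlgorithmException {
        List<Path> files = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, "*.csv")) {
            for (Path p : stream) {
                files.add(p);
            }
        }
        Collections.sort(files); // deterministic order: first by name wins

        Map<String, Path> firstSeen = new HashMap<>(); // content hash -> kept file
        List<Path> deleted = new ArrayList<>();
        for (Path file : files) {
            String key = contentHash(file);
            // putIfAbsent returns the existing entry if this content was already seen
            if (firstSeen.putIfAbsent(key, file) != null) {
                Files.delete(file);
                deleted.add(file);
            }
        }
        return deleted;
    }
}
```

Calling DuplicateFileRemover.removeDuplicates(Paths.get("/path/to/dir")) on a directory like the one in the question would leave one file per distinct content. Inside a job, the same logic could probably go in a tJava component or a routine, but I have not tested that in ESB 7.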
It worked and removed the duplicate files. Give it a try.