
Contributor III
2015-08-31
10:32 AM
Find out if a ZIP / GZ file is corrupt
Hi there,
I have a "funny" scenario where I receive .GZ files from a partner system, but some of the files are corrupt.
So I need a process that checks the integrity of each file and skips it if it is corrupt.
The process must not abort, though, which is the default behavior of Talend.
So how can I make sure I only process "good" files and skip "bad" ones?
Thanks
Matt
593 Views
6 Replies

Creator
2015-08-31
12:07 PM
I expected the 'check integrity' option to skip bad files and proceed gracefully. As an alternative, how about using a tSystem component and executing the unzip command per file? That way you can also capture info about which files returned errors.

Contributor III
2015-08-31
12:21 PM
Author
That's what I expected, too. Unfortunately, it does not act that way. Instead, it crashes as if the checkmark were not set. Well, "crashing" is not the right word: it throws an exception and jumps out of the iterator, so it does not continue after detecting a bad file.
Plus, the integrity check does not even work properly: it does not recognize incomplete files, for instance files that were compressed properly but only partially transmitted to the target system.
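For what it's worth, a truncated .gz file can usually be caught by decompressing it all the way to the end, because the CRC-32 and length stored in the gzip trailer are only checked once the stream is exhausted. Below is a minimal sketch in plain Java using `java.util.zip.GZIPInputStream`. The class and method names are made up for illustration, and nothing here is taken from Talend's own integrity check.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.GZIPInputStream;

// Hypothetical helper, not part of Talend.
final class GzipIntegrity {

    /** Returns true only if the whole gzip stream decompresses without error. */
    static boolean isIntact(Path file) {
        byte[] buffer = new byte[8192];
        try (InputStream in = new GZIPInputStream(Files.newInputStream(file))) {
            // Drain the stream and discard the data; the trailer (CRC-32 and
            // uncompressed length) is only verified when the end is reached.
            while (in.read(buffer) != -1) {
                // nothing to do, we only care whether reading succeeds
            }
            return true;
        } catch (IOException e) {
            // Covers ZipException (bad header or CRC mismatch) and
            // EOFException (file ends before the stream is complete).
            return false;
        }
    }
}
```

Since the method returns a boolean instead of throwing, it could be called from something like a tJava or tJavaFlex step inside the file iteration to decide whether to process or skip a file. The cost is that every file gets decompressed twice (once for the check, once for the real processing).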

Creator
2015-08-31
12:29 PM
I have faced a similar issue in the past, and I agree with you. I've created a Jira ticket to have this looked at and hopefully resolved soon:
https://jira.talendforge.org/browse/TDI-33802

Contributor III
2015-08-31
12:37 PM
Author
Thank you, willm.
Let's hope Talend does something about it soon.
In TOS DI 6.0, the issue is still there, though.

Anonymous
Not applicable
2015-08-31
01:13 PM
Thanks for reporting this issue in our bugtracker.

Anonymous
Not applicable
2015-09-01
04:41 AM
Hi all,
For jobs where I expect 'normal' failures (corrupted zip files are a good example), I put the processing in a 'child' job and uncheck the 'Die on child error' option.
The parent job manages the loop and/or iteration, and the child job only does the processing.
This could be a workaround...
Hope it helps.
Regards,
laurent
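The idea behind this workaround can be sketched in plain Java: the parent loop catches whatever the per-file work throws, records it, and moves on to the next file. Everything below (`IsolatedFileLoop`, `FileTask`, `Result`, `runAll`) is a hypothetical illustration of the pattern and not generated Talend code.

```java
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of "parent iterates, child may fail".
final class IsolatedFileLoop {

    /** Stand-in for the child job: handles one file and may throw. */
    interface FileTask {
        void process(Path file) throws Exception;
    }

    /** Files that were processed, and files that failed together with the cause. */
    static final class Result {
        final List<Path> processed = new ArrayList<>();
        final Map<Path, Exception> failed = new LinkedHashMap<>();
    }

    /** Stand-in for the parent job: a failure in one file never stops the loop. */
    static Result runAll(List<Path> files, FileTask task) {
        Result result = new Result();
        for (Path file : files) {
            try {
                task.process(file);
                result.processed.add(file);
            } catch (Exception e) {
                // Equivalent in spirit to leaving 'Die on child error' unchecked:
                // remember the failure and continue with the next file.
                result.failed.put(file, e);
            }
        }
        return result;
    }
}
```

Keeping the failures in a map means the bad files can be logged or moved to a quarantine folder after the loop, rather than being silently dropped.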
