Anonymous

Problem with processing huge XML file with tFileInputDelimited

Hi, I have built a very simple job to remove the header (4 lines) and the last line of a very large XML document (more than 100 GB) encoded in ISO-8859-1.
It works like this: a tFileInputDelimited reads the document line by line and removes the 4-line header.
Then a tReplace removes the last tag (<\IproClassDatabse>); I didn't find any other solution for such a big file.
But when the job is done, the new file (without the header and the last line) has about half as many lines as the original, when it should have only 5 fewer!
Using the tail command, I can see that the new XML document doesn't end the same way as the original. The job seems to have stopped processing the document partway through.
I have tried this job with smaller XML documents and there is no error...
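
For comparison, the same header and last-line removal can be sketched in plain Java outside Talend. This is only a minimal sketch under assumptions, not what the Talend components generate: the class name `HeaderFooterStripper` and the method `strip` are hypothetical, and line endings in the output are normalized to `\n`. It streams the file, so memory use does not depend on the file size.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of Talend.
public class HeaderFooterStripper {

    /**
     * Copies input to output, skipping the first headerLines lines and the
     * final line. Returns the number of lines written.
     */
    public static long strip(Path input, Path output, int headerLines) throws IOException {
        long written = 0;
        try (BufferedReader reader = Files.newBufferedReader(input, StandardCharsets.ISO_8859_1);
             BufferedWriter writer = Files.newBufferedWriter(output, StandardCharsets.ISO_8859_1)) {
            // Skip the header; stop early if the file is shorter than the header.
            for (int i = 0; i < headerLines; i++) {
                if (reader.readLine() == null) {
                    return 0;
                }
            }
            // Hold one line back: a line is written only once a following
            // line has been read, so the very last line is never written.
            String pending = reader.readLine();
            String next;
            while (pending != null && (next = reader.readLine()) != null) {
                writer.write(pending);
                writer.write('\n'); // assumption: Unix line endings in the output
                written++;
                pending = next;
            }
        }
        return written;
    }
}
```

Calling `HeaderFooterStripper.strip(Paths.get("in.xml"), Paths.get("out.xml"), 4)` (file names are placeholders) would drop the 4 header lines and the closing tag line. If this version produces a complete file on the same input, that would suggest the truncation comes from the job or its components rather than from the file size itself.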

This is a very simple job, so I really don't see where the problem is. Even though the XML document is very big (120 GB), that shouldn't be a problem; it should just take some time to finish.
Has anyone already met a similar problem, or does anyone have an idea where the problem comes from?
Screenshots of the job:
http://i.imgur.com/T8tOlkUh.png
http://i.imgur.com/FGc4vu4h.png
http://i.imgur.com/diYxizxh.png
5 Replies
Anonymous

Hi
To read a file line by line, I would suggest using tFileInputFullRow.
Shong
Anonymous

Hi, thanks for the answer. I have just tried it, but the problem is the same: it stops at the same place.
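
Since the job stops at the same place on every run, one way to confirm the truncation without relying on tail is to compare newline counts between the original and the output. The following is a rough sketch only: `LineCountCheck`, `countNewlines`, and `looksComplete` are hypothetical names, and it counts `\n` bytes the way `wc -l` does, so a final line without a trailing newline is not counted.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical diagnostic helper, not part of Talend.
public class LineCountCheck {

    /** Counts '\n' bytes in the file, reading it in fixed-size chunks. */
    public static long countNewlines(Path file) throws IOException {
        long count = 0;
        byte[] buffer = new byte[1 << 16];
        try (InputStream in = Files.newInputStream(file)) {
            int read;
            while ((read = in.read(buffer)) != -1) {
                for (int i = 0; i < read; i++) {
                    if (buffer[i] == '\n') {
                        count++;
                    }
                }
            }
        }
        return count;
    }

    /** True if the stripped file has exactly removedLines fewer newlines than the original. */
    public static boolean looksComplete(Path original, Path stripped, int removedLines) throws IOException {
        return countNewlines(original) - countNewlines(stripped) == removedLines;
    }
}
```

For this job, `removedLines` would be 5 (4 header lines plus the last line). Counting raw bytes avoids any charset decoding, so the check itself cannot be affected by the ISO-8859-1 content.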
Anonymous

Hi
The job is really simple, and I don't see anything wrong in the job settings. Which version are you using? Does the job end normally, without error?
Shong
Anonymous

Hi,
I am using Talend Open Studio for ESB (5.3.0.r101800).
And the job ends normally, without error.
Anonymous

I wonder whether the tBoostedFileInputXML component can handle that kind of file too...