Hi
I have a subjob that processes some data and writes it to a CSV with tFileOutputDelimited (1167 rows). I need to pull this file into a tFileInputDelimited further down the flow, but every time it imports 0 rows.
So I split the job so that the tFileInputDelimited is the first component of a second subjob, triggered when the first subjob completes; however, now it only pulls in 57 rows!
If I run the second subjob independently, it pulls in every row from the file. Is there any reason it won't do this when triggered via onSubjobOk?
The job is essentially like this:
tMap --> tFileOutputDelimited
   |
   v  onSubjobOk
   |
   v
tFileInputDelimited --> tMap
Any help would be appreciated. The output file is working fine, and I can't understand why the input won't read all the data rows unless the subjob is run independently! I have attached images of the input and output settings...
You may be running into a timing issue where the output file has not yet been flushed to disk and closed. If you want to keep the current design, put something like a 30-second tSleep between the two subjobs to give the data enough time to flush to disk.
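To picture what is happening, here is a minimal standalone Java sketch (plain Java, not Talend-generated code; the file name and row format are made up) of how an unflushed write buffer can leave a reader seeing only part of the file:

import java.io.BufferedWriter;
import java.io.FileWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

public class FlushDemo {
    public static void main(String[] args) throws IOException {
        // Hypothetical path standing in for the tFileOutputDelimited target
        Path path = Path.of("out.csv");

        BufferedWriter writer = new BufferedWriter(new FileWriter(path.toFile()));
        for (int i = 1; i <= 1167; i++) {
            writer.write("row;" + i + "\n");
        }
        // At this point some of the data may still sit in the writer's
        // internal buffer. A reader that opens the file now only sees
        // the rows flushed so far; the rest are invisible to it.
        try (Stream<String> lines = Files.lines(path)) {
            System.out.println("Rows visible before close: " + lines.count());
        }

        writer.close(); // flushes the remaining buffer to disk
        try (Stream<String> lines = Files.lines(path)) {
            System.out.println("Rows visible after close:  " + lines.count());
        }
    }
}

Until close() (or flush()) runs, whatever is still buffered never reaches the file, which is consistent with a partial count like your 57 rows.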
However, a better solution is to not use an external file at all. For this small number of rows, take a look at tHashOutput and tHashInput. These components keep the rows in an in-memory hash table, which gives better performance, operates in real time, and does not rely on the disk at all.
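As a rough analogy for what the tHash pair gives you (this is illustrative Java, not the actual Talend component internals; RowCache is a made-up name), the rows simply live in JVM memory between the two subjobs, so there is no flush/close window at all:

import java.util.ArrayList;
import java.util.List;

// Rough analogy for tHashOutput/tHashInput: rows are cached in JVM
// memory instead of being written to and re-read from a file.
// (RowCache is a hypothetical name, not part of the Talend API.)
public class RowCache {
    private static final List<String[]> CACHE = new ArrayList<>();

    // "tHashOutput" side: the first subjob appends rows as they arrive
    public static void write(String[] row) {
        CACHE.add(row);
    }

    // "tHashInput" side: the second subjob reads the completed cache
    public static List<String[]> readAll() {
        return List.copyOf(CACHE); // defensive copy
    }

    public static void main(String[] args) {
        write(new String[] {"id1", "value1"});
        write(new String[] {"id2", "value2"});
        System.out.println("Cached rows: " + readAll().size());
    }
}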
Be aware of things like memory consumption when using the tHash components. I have jobs that cache several hundred thousand rows of data and they are just fine. Your mileage may vary, depending on your environment.