How does output stream work?

Anonymous · ‎2014-03-05

Hi, I would like to have a job that uses the output of a tFileOutputDelimited as the input for a tFileInputDelimited.
I'd like to do this without actually writing to a physical file.
Is there a default expression I can use to get the data from this component's output stream?
For example, my tFileInputDelimited component uses a tFileFetch as its source and the File name/Stream field looks like this:
((java.io.InputStream)globalMap.get("tFileFetch_1_INPUT_STREAM"))
Can I use something similar, e.g.
((java.io.InputStream)globalMap.get("tFileOutputDelimited_1_INPUT_STREAM"))
?
I tried exactly this, but I get the error message: outputStream cannot be resolved to a variable
Thanks

willm1 · ‎2014-03-05

Streaming is faster but still creates a physical file.
A workaround that avoids files on disk is to use the tBufferOutput and tBufferInput components. You can read about them here:
https://help.talend.com/search/all?query=tBufferInput&content-lang=en
If you don't find these components in the palette in your Studio, you can add them by going to File --> Edit Project properties --> Designer --> Palette Settings.

Anonymous · ‎2014-03-07

Hi and thanks
Do you know if I can feed the stream of a tBufferInput into a tFileInputDelimited?
I have to define my schema mid-job with some special delimiter and line break characters (it's just a single string column before that).
It seems I need a tFileInputDelimited to specify what those characters are for the incoming data.

willm1 · ‎2014-03-07

Sounds like you need the tExtractDelimitedFields after your tBufferInput to split the single column into multiple columns...
See usage in this help file: https://help.talend.com/search/all?query=tExtractDelimitedFields&content-lang=en

Anonymous · ‎2014-03-07

Hi, I had looked at tExtractDelimitedFields, but I am also using special characters to denote line breaks, so I didn't know how to proceed.

willm1 · ‎2014-03-07

See attached screenshots for what the job could look like....

willm1 · ‎2014-03-07

Presuming you start by reading a file (like I did in my screenshots above), you'd specify what special characters denote line breaks. In the attached screenshot, it's "\n". Change it to yours...

Anonymous · ‎2014-03-07

Thanks for sharing the screen caps
The issue is that I basically have a delimited file within a delimited file

On the first pass I get the normal delimited fields; on the second pass I get the embedded file that was within one of the columns.
What it amounts to is that I need to define my line breaks mid-job, which is why I asked whether I could stream the tBufferInput to a tFileInputDelimited.
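
As a general Java illustration of the idea being asked about here (not a confirmed Talend configuration): anything written to a `java.io.ByteArrayOutputStream` stays in memory and can be read back through a `java.io.ByteArrayInputStream`, so no physical file is involved. In this minimal sketch the `globalMap` is a plain `HashMap` standing in for the job's global map, and the key `"my_buffer"` and all class and method names are hypothetical. Whether a given component version accepts such a stream in its File name/Stream field is an assumption that would need checking in Studio.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class InMemoryStreamSketch {

    // Writes one row per line to any OutputStream, as a file writer would.
    static void writeRows(OutputStream out, List<String> rows) throws IOException {
        for (String row : rows) {
            out.write((row + "\n").getBytes(StandardCharsets.UTF_8));
        }
        out.flush();
    }

    // Reads every line back from any InputStream.
    static List<String> readRows(InputStream in) throws IOException {
        List<String> rows = new ArrayList<>();
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                rows.add(line);
            }
        }
        return rows;
    }

    // Writes rows into an in-memory buffer and reads them back, never touching disk.
    static List<String> roundTrip(List<String> rows) {
        // Stand-in for the job-wide map; the key name is hypothetical.
        Map<String, Object> globalMap = new HashMap<>();
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        globalMap.put("my_buffer", buffer);
        try {
            // The writer side only sees an OutputStream, looked up by key.
            writeRows((OutputStream) globalMap.get("my_buffer"), rows);
            // The reader side gets an InputStream over the same bytes.
            InputStream in = new ByteArrayInputStream(buffer.toByteArray());
            return readRows(in);
        } catch (IOException e) {
            // Not expected for in-memory streams; rethrown unchecked for simplicity.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        for (String row : roundTrip(Arrays.asList("typeA|111111|x", "typeA|111112|x"))) {
            System.out.println(row);
        }
    }
}
```

The point of the sketch is only that the writer and reader agree on a shared in-memory buffer instead of a file path.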

willm1 · ‎2014-03-07

Sample layout of your file?

Anonymous · ‎2014-03-07

Here it is:

ga:eventLabel,ga:totalEvents
typeA|111111|x;typeA|111112|x;typeB|111113|x;typeB|111114|x,20
typeA|111115|x;typeA|111116|x;typeB|111117|x;typeB|111118|x,32

The first column uses "|" as the field delimiter and ";" as the line break.
In step one of my job I drop the second column and do a string replace to replace the "x" with the value of column 2, "ga:totalEvents".
So then I have one column with everything in it, e.g.

typeA|111111|x;typeA|111112|x;typeB|111113|x;typeB|111114|20
typeA|111115|x;typeA|111116|x;typeB|111117|x;typeB|111118|32

This works fine if I then output to a file with a one-column schema and read the same file back with a three-column schema, but of course I'm trying to avoid writing to physical files.
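
The write-then-reread step described above can be sketched in memory with plain Java: split the single column on ";" to get the embedded records, then split each record on "|" to get its three fields. The class and method names here are hypothetical, and this is not code generated by Talend; inside a job, the same logic would have to sit in a component that accepts custom Java.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

public class NestedDelimitedSplit {

    // Splits one column value into records, then each record into fields.
    // Separators are quoted because "|" is a regex metacharacter.
    static List<String[]> splitRecords(String column, String rowSeparator, String fieldSeparator) {
        List<String[]> rows = new ArrayList<>();
        for (String record : column.split(Pattern.quote(rowSeparator), -1)) {
            if (record.isEmpty()) {
                continue; // skip empty records, e.g. after a trailing separator
            }
            // A limit of -1 keeps trailing empty fields instead of dropping them.
            rows.add(record.split(Pattern.quote(fieldSeparator), -1));
        }
        return rows;
    }

    public static void main(String[] args) {
        // First sample line after step one, as posted in the thread.
        String column = "typeA|111111|x;typeA|111112|x;typeB|111113|x;typeB|111114|20";
        for (String[] fields : splitRecords(column, ";", "|")) {
            System.out.println(String.join("\t", fields));
        }
    }
}
```

For the first sample line this gives four rows of three fields each, the last one being typeB, 111114, 20.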

Java

Talend Data Integration

v5.x