Hi,
First, thanks for the quick response. I'll try to make the difficulties I'm facing a bit clearer.
So there is NO header row in any input file. The only source of header information is the Excel spreadsheet that matches the table name (which is the same as the input file name and is passed to the job via a context variable); the column names have to be extracted from its "ColumnName" column. But as far as I have tested and read in the documentation, reading data into a dynamic schema has a somewhat "special" behavior. Using a dynamic schema with tFileInputDelimited will always read a header row "whether the Header value is set to 0 or to 1". So while reading the file, the first data row is misinterpreted as the header row.
I tried two approaches:
1. If I change the metadata.columnNames in a Java component, I lose the information of the first row. So I have to insert the header before reading the file, which then means creating a new temporary file, adding the CSV-formatted header, adding the data, and then reading this file in another subjob with a dynamic schema. This works, but writing an 8 GB temporary file just to add one header line at the beginning and then writing another 8 GB output file is not very efficient.
2. Creating the output file with the header row and then appending all data processed by the processing subjob. At first this looks quite OK, but because the first data row is used as the header row rather than as data, it is not processed and transformed by the tJavaFlex component that transforms the data.
So I guess I have two options. The first is to somehow add the header row to the data before reading and processing it with Talend's dynamic schema, without writing a whole temporary file. At the moment I read in the file as shown in the picture below.
The second is to choose a predefined schema (a File delimited schema, a generic schema, or whatever is appropriate) dynamically at runtime. But as far as I have dug into this "dynamic schema" topic, I think it is not possible to select a schema via a variable (see picture).
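To make the idea of adding the header without a temporary file more concrete, here is a minimal sketch in plain Java. It puts a generated header line in front of the data stream in memory using java.io.SequenceInputStream, so the 8 GB file is never copied. The class name HeaderPrepender, the method names, the UTF-8 encoding and the "\n" line ending are my assumptions for illustration only. It also assumes that tFileInputDelimited can take such a stream in its "File name/Stream" field when a dynamic schema is used, which is not verified here. Column names are not quoted or escaped.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Hypothetical helper, not part of Talend: prepends a header line to a data stream.
class HeaderPrepender {

    /** Joins the column names into one delimited header line ending with a newline. */
    static String buildHeaderLine(List<String> columnNames, String delimiter) {
        // Assumption: names contain neither the delimiter nor line breaks.
        return String.join(delimiter, columnNames) + "\n";
    }

    /** Returns a stream that yields the header line first, then the data unchanged. */
    static InputStream withHeader(List<String> columnNames, String delimiter, InputStream data) {
        byte[] header = buildHeaderLine(columnNames, delimiter).getBytes(StandardCharsets.UTF_8);
        // SequenceInputStream reads the first stream to its end, then continues with the second.
        return new SequenceInputStream(new ByteArrayInputStream(header), data);
    }

    /** Convenience overload that opens the data file as a stream; no copy is written. */
    static InputStream withHeader(List<String> columnNames, String delimiter, Path dataFile)
            throws IOException {
        return withHeader(columnNames, delimiter, Files.newInputStream(dataFile));
    }
}
```

The column names would come from the "ColumnName" column of the spreadsheet, and the combined stream would replace the file name in the input component. Whoever consumes the stream also has to close it.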
I'm quite new to Talend and thought it would not be a big deal to read the schema of an input file at runtime, but I'm slowly running out of ideas.
Thanks,
Thomas