Hello Community,
I have a scenario where I have to read a multi-schema file and write it out to multiple output files. The number of output files depends on the number of schemas inside the input file. I tried this using tFileInputMSDelimited, but I could not find a way to write the output to multiple files. The input file is attached for reference.
Is there any way to handle such a situation?
Thanks in advance!
Best Regards,
Dipanjan
Sending you a sample job. Change the path accordingly.
Idea
- Get the start and end line number of each schema.
- Iterate through them and read the input file with a different header and Limit for each schema.
There might be other ways to do it.
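For anyone who wants to see the two steps outside of a Talend job, here is a minimal plain-Java sketch. It assumes every schema block begins with a header line starting with "Name" (the condition mentioned later in this thread). The class name, method names and sample rows are made up for illustration and are not taken from the attached job.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SchemaBlocks {

    // Step 1: collect the 0-based line number at which each schema block starts.
    // Assumption: a block starts at any line beginning with headerPrefix.
    public static List<Integer> findStartLines(List<String> lines, String headerPrefix) {
        List<Integer> starts = new ArrayList<>();
        for (int i = 0; i < lines.size(); i++) {
            if (lines.get(i).startsWith(headerPrefix)) {
                starts.add(i);
            }
        }
        return starts;
    }

    // Step 2: cut the file into one list of lines per schema block.
    // Each block runs from its own start line up to the next start line (or end of file).
    public static List<List<String>> splitBlocks(List<String> lines, String headerPrefix) {
        List<Integer> starts = findStartLines(lines, headerPrefix);
        List<List<String>> blocks = new ArrayList<>();
        for (int i = 0; i < starts.size(); i++) {
            int from = starts.get(i);
            int to = (i + 1 < starts.size()) ? starts.get(i + 1) : lines.size();
            blocks.add(new ArrayList<>(lines.subList(from, to)));
        }
        return blocks;
    }

    public static void main(String[] args) {
        // Hypothetical sample input with two schemas.
        List<String> lines = Arrays.asList(
            "Name;Age", "Ann;31", "Bob;45",
            "Name;City;Country", "Ann;Paris;FR");
        System.out.println(findStartLines(lines, "Name")); // [0, 3]
        System.out.println(splitBlocks(lines, "Name").size()); // 2
    }
}
```

Note that a data row whose first field happens to start with "Name" would be mistaken for a header here, so a stricter header test may be needed for real files.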
Many Thanks for the prompt response!
I was able to get the column names with tFileInputMSDelimited as well. I tried your way; however, the output files contain only the column names and not the column values for each schema. Also, could you please explain a little more about your thought process regarding the second tMap and its use case, specifically the Lastlinenumber?
Below is the screenshot from the output files -
P.S. I have removed the non-alphanumeric characters from the input file (attached for reference). Earlier I had added such characters just to separate the 3 schemas for better understanding.
Please open it in WordPad, Notepad++ or Excel. It may be because of the OutputPositional component (not sure).
The above code is to get the number of records in each schema.
Startlinenumber|NumberofLinetoread
8|-1
4|4
0|4
For the 1st schema, head is 0 and Limit is 4 (to read only 4 records).
For the 2nd schema, head should be 4 and Limit again 4 (the number of records in that schema).
For the last schema, head should be 8 and Limit -1 (to read the rest of the file).
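The arithmetic behind those pairs can be sketched in a few lines of Java: head is simply the start line of the schema, and Limit is the distance to the next schema's start line, with -1 for the last schema. This is only an illustration; the class and method names are hypothetical, and it assumes the start lines (0, 4, 8 here) have already been found.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class HeadLimit {

    // Turns schema start lines into {head, limit} pairs.
    // head  = number of lines to skip before reading
    // limit = number of lines to read, or -1 to read to the end of the file
    public static List<int[]> toHeadLimit(List<Integer> startLines) {
        List<int[]> pairs = new ArrayList<>();
        for (int i = 0; i < startLines.size(); i++) {
            int head = startLines.get(i);
            boolean last = (i == startLines.size() - 1);
            int limit = last ? -1 : startLines.get(i + 1) - head;
            pairs.add(new int[] {head, limit});
        }
        return pairs;
    }

    public static void main(String[] args) {
        // Prints 0|4, 4|4 and 8|-1, one pair per line.
        for (int[] p : toHeadLimit(Arrays.asList(0, 4, 8))) {
            System.out.println(p[0] + "|" + p[1]);
        }
    }
}
```

These are the same three pairs as in the Startlinenumber|NumberofLinetoread listing, just in the opposite order.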
How did you implement it using tFileInputMSDelimited?
@uganesh ,
In tFileInputMSDelimited, I had to manually define the column names for each schema inside the "Fetch Codes" section, but I was unable to get the column values.
I'm still not clear about the explanation below:
"For 1st schema , head is 0 and Limit is 4 ( to read only 4 records)
For 2nd schema , head should be 4 and Limit again 4 ( To read number of record in that schema)
For last schema , head should be 8 and Limit -1 ( to read rest of file)."
Furthermore, in this case each schema has a fixed number of rows. What if I need to handle a scenario like the one below?
Schema 1 has 3 rows
Schema 2 has 5 rows
Schema 3 has 9 rows
Will I be able to achieve this using the same approach?
Yes, it should work. The only condition is that each schema should start with the string "Name".
Attaching another job with a simpler approach. Change the input/output directory.
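To illustrate why uneven block sizes (3, 5 and 9 rows) are not a problem, here is a rough Java sketch that starts a new output file every time a line begins with "Name". It is not necessarily what the attached job does; the class name, the schema_N.txt file naming and the generated sample rows are all assumptions made for the example.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

class SchemaFileWriter {

    // Writes each schema block to outDir/schema_1.txt, schema_2.txt, ...
    // A new block begins whenever a line starts with headerPrefix.
    public static List<Path> writeBlocks(List<String> lines, String headerPrefix, Path outDir)
            throws IOException {
        List<Path> written = new ArrayList<>();
        List<String> current = null;
        for (String line : lines) {
            if (line.startsWith(headerPrefix)) {
                flush(current, outDir, written);
                current = new ArrayList<>();
            }
            if (current != null) { // lines before the first header are ignored
                current.add(line);
            }
        }
        flush(current, outDir, written);
        return written;
    }

    // Writes one finished block to the next numbered file (hypothetical naming).
    private static void flush(List<String> block, Path outDir, List<Path> written)
            throws IOException {
        if (block == null) {
            return;
        }
        Path target = outDir.resolve("schema_" + (written.size() + 1) + ".txt");
        Files.write(target, block, StandardCharsets.UTF_8);
        written.add(target);
    }

    public static void main(String[] args) throws IOException {
        // Build made-up input: three schemas with 3, 5 and 9 data rows.
        List<String> lines = new ArrayList<>();
        int[] rowsPerSchema = {3, 5, 9};
        for (int s = 0; s < rowsPerSchema.length; s++) {
            lines.add("Name;Col" + s);
            for (int r = 0; r < rowsPerSchema[s]; r++) {
                lines.add("row" + r + ";value" + r);
            }
        }
        Path outDir = Files.createTempDirectory("schemas");
        for (Path p : writeBlocks(lines, "Name", outDir)) {
            System.out.println(p.getFileName() + ": " + Files.readAllLines(p).size() + " lines");
        }
    }
}
```

Each output file contains the header line plus its data rows, so the three files end up with 4, 6 and 10 lines. Because the split is driven by the header line and not by a row count, the number of rows per schema does not need to be known in advance.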
@uganesh, why have you set the Field Separator to "XYZ"?