[resolved] How To Validate Delimiters in Row Using tFileInputFullRow component
Hi
I have files with different types of delimiters, and I don't know where or in which file a delimiter issue occurs. To find out, I am creating a job that reads each file in full-row mode, counts the delimiters in each row in a tJavaRow component, and stores that count in a context variable, which tMap then uses to route the row to an output.
My job looks like this:
tFileList----tFileInputFullRow----tJavaRow----tMap----tFileOutputPositional (Good Records)
                                               |
                                               |
                                             tFileOutputPositional (Bad Records)
In the tJavaRow code:
context.vDelCount = line.split(delimiter).length;
In tMap I have created two outputs, one for good records and one for bad records, using these conditions:
context.vDelCount == defaultCount then send to good
context.vDelCount != defaultCount then send to bad
But in some files the delimiter appears inside a " " quoted string, and I have not been able to figure out how to resolve this. Sometimes a file has nested quoted strings, like below:
" some |text " inner|text " "|xyz string|pqr string"|""
Is there any way to set a text qualifier for quoted strings in tFileInputFullRow?
Please suggest how to parse the above row so that I can validate the delimiter count and also handle the nested quoted string issue.
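A side note on the count computed in tJavaRow, assuming the delimiter is held as a plain string such as "|". In Java, `String.split` takes a regular expression, so an unescaped pipe is read as regex alternation rather than as a literal character, and with the default limit any trailing empty fields are dropped from the result. A minimal sketch of a safer field count follows; the names `FieldCount` and `countFields` are made up for illustration and are not part of Talend.

```java
import java.util.regex.Pattern;

class FieldCount {
    // Counts fields by splitting on a literal delimiter.
    // Pattern.quote stops "|" from being treated as regex alternation,
    // and the limit -1 keeps trailing empty fields in the result.
    static int countFields(String line, String delimiter) {
        return line.split(Pattern.quote(delimiter), -1).length;
    }

    public static void main(String[] args) {
        System.out.println(countFields("a|b|c||", "|"));   // prints 5
        // Escaped pattern but default limit: the two trailing empty fields are lost.
        System.out.println("a|b|c||".split("\\|").length); // prints 3
    }
}
```

This still counts every delimiter, including those inside quotes, so it only addresses the regex and trailing-field pitfalls.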
Sorry, some text was missing. I mean to say that some delimiters come inside quoted strings. In that case, how do I skip the quoted delimiters when validating the delimiter count in a row? If there is no way to set a text qualifier in tFileInputFullRow, could you suggest how to accomplish this task?
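One way to skip quoted delimiters without a text qualifier option is to walk the row character by character and count a delimiter only while outside quotes. The sketch below assumes every quote character simply toggles the "inside quotes" state; that is an assumption, not something tFileInputFullRow provides, and truly nested quotes remain ambiguous under it. The names `QuoteAwareCount` and `countOutsideQuotes` are hypothetical.

```java
class QuoteAwareCount {
    // Counts occurrences of 'delimiter' that are not enclosed in 'quote' characters.
    // Assumption: each quote character toggles the quoted state, so a doubled
    // quote ("") inside a quoted field closes and immediately reopens it.
    static int countOutsideQuotes(String line, char delimiter, char quote) {
        boolean inQuotes = false;
        int count = 0;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == quote) {
                inQuotes = !inQuotes;
            } else if (c == delimiter && !inQuotes) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countOutsideQuotes("a|\"b|c\"|d", '|', '"'));     // prints 2
        System.out.println(countOutsideQuotes("\"x|y\"|\"\"|z|", '|', '"')); // prints 3
    }
}
```

In a tJavaRow the same loop could assign its result to the context variable in place of the split-based count, again only as a sketch under the toggle assumption.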
Thank you Pedro, but I think this will not work, because I want to check how many columns are present in a row and then process it further based on that. Could you tell me how to count the pipe delimiters in a row? The count should not include pipe delimiters inside quotes.
" some |text " |inner|text |"xyz string|pqr string"|""|dsadsadsa|sdasadsa|
The above string has three parts, see below:
1-- " some |text " |inner|text |"xyz string|pqr string"
2-- " |inner|text |"
3-- |""|dsadsadsa|sdasadsa|
The second part is a sub-part of the first part. Please suggest how to calculate the count in this case, or in the general case.
I suggest you try the opencsv parser, which gives you a very flexible way to parse strings into arrays with variable delimiters, text enclosures and escape characters.
Sorry, we cannot use any third-party utilities. I found a solution on Google to skip delimiters inside quoted strings, but now I have a new problem: I need to remove the quotes that appear inside a quoted string. See the example below.
INPUT String: 439.00 01/28/2012 65 39284 7218063470 "xyz string"" with pqr string" 78.00 0000505585 my test string 2.06 history be 29.00
OUTPUT Line: 439.00 01/28/2012 65 39284 7218063470 "xyz string with pqr string" 78.00 0000505585 my test string 2.06 history be 29.00
Any suggestions?
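A possible approach with plain Java and no third-party library is a small state machine: while inside a quoted string, a quote immediately followed by another quote is treated as an embedded pair and dropped, and any other quote opens or closes the string. This is only a sketch under that assumption (it matches the INPUT/OUTPUT example above, but other quoting conventions may need different rules), and the names `DoubledQuoteStripper` and `stripDoubledQuotes` are made up for illustration.

```java
class DoubledQuoteStripper {
    // Removes doubled quote characters that occur inside a quoted string.
    // An empty quoted field ("") outside any quoted string is left untouched,
    // because its first quote opens the string and its second quote closes it.
    static String stripDoubledQuotes(String line, char quote) {
        StringBuilder out = new StringBuilder(line.length());
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c != quote) {
                out.append(c);
                continue;
            }
            boolean nextIsQuote = i + 1 < line.length() && line.charAt(i + 1) == quote;
            if (inQuotes && nextIsQuote) {
                i++; // skip both quotes of the embedded pair
                continue;
            }
            inQuotes = !inQuotes; // opening or closing quote: keep it
            out.append(c);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String in = "7218063470 \"xyz string\"\" with pqr string\" 78.00";
        // prints: 7218063470 "xyz string with pqr string" 78.00
        System.out.println(stripDoubledQuotes(in, '"'));
    }
}
```

A blanket `line.replace("\"\"", "")` would be shorter, but it would also delete legitimately empty quoted fields, which is why the sketch tracks whether it is inside a quoted string.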