Hi,
I have a Excel file. I need to validate the excel file against the schema for each column and have to find out if any column header is missing. Also need to store the erroneous rows / missing column in a log file.
Please suggest how to do this?
Thanks for your response. My job structure is like this tfileInputExcel -> tSchemaComplianceCheck -> tlogRow Excel file has some date columns. In tfileInputExcel, the datatype is defined as Date for those columns. So when some rows have invalid date, it is discarded by tfileInputexcel component itself and the row does not come to tSchemaComplianceCheck component. In that case how can i log this invalid data to log file? Should i make all columns as String in tfileInputExcel schema or is there any other way to handle this?
Yes; if you suspect invalid data you must read it as a String before using tSchemaComplianceCheck (in the "Use another schema" mode). You can follow that with a tConvertType to change the data types for the valid data.
Note that you can also get Rejects rows from tFileInputExcel and tConvertType; you don't necessarily have to use tSchemaComplianceCheck.