Alpha549
Creator II

tSchemaComplianceCheck : type check doesn't work ?

Hello everyone,

I'm discovering tSchemaComplianceCheck with a simple test job :

0695b00000N1yh0AAB.png

Here is the tFileInputDelimited schema :

0695b00000N1ykNAAR.png

tSchemaComplianceCheck :

0695b00000N1ykcAAB.png

0695b00000N1yl1AAB.png

The input delimited file :

0695b00000N1ylfAAB.png

However, here is what I get when I run the job :

0695b00000N1ymOAAR.png

I was expecting to find my D2 line in the tLogRow2 as COL3 is not an Integer.

Or have I misunderstood how the tSchemaComplianceCheck component works?

Thank you in advance 🙂


Accepted Solutions
Anonymous
Not applicable

This can be a bit confusing to start with. The problem is that you are setting the Integer type for the column in the tFileInputDelimited schema. That component reads the data (flat file data is essentially all Strings) and tries to convert it to String, String and Integer columns. If a conversion is not possible, the row fails at that point. As you can see from the error at the top of your log, the tFileInputDelimited component could not convert this column, so it failed the row before the tSchemaComplianceCheck ever saw it.
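To make the failure point concrete, here is a minimal Java sketch of what typing a column as Integer at read time amounts to. It is not Talend's generated code; the class and the `readAsInteger` helper are hypothetical names used only for illustration. The only real API involved is `Integer.parseInt`, which throws a `NumberFormatException` on non-numeric text.

```java
// Sketch only: mimics the idea of typing a column as Integer at read time.
// This is not Talend's generated code; all names here are hypothetical.
class ReadTimeConversionSketch {

    // Converts the raw text of a field to an Integer, as a reader with an
    // Integer-typed column has to do for every row.
    static Integer readAsInteger(String rawField) {
        return Integer.parseInt(rawField.trim());
    }

    public static void main(String[] args) {
        String[] rawCol3 = {"1", "3", "t"};
        for (String raw : rawCol3) {
            try {
                System.out.println("parsed " + readAsInteger(raw));
            } catch (NumberFormatException e) {
                // The row is lost here, before any downstream check can see it.
                System.out.println("failed at read time: " + raw);
            }
        }
    }
}
```

The value `t` never gets past the reader, so a compliance check placed after it has nothing left to reject.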

 

To test the tSchemaComplianceCheck, you need to read every column of your flat file as a String. The assumption is that at this point you do not yet know whether the content of the file is correct; there are often errors. So you read everything in as Strings, then use the tSchemaComplianceCheck to ensure that the data meets the expected schema before you convert it.

 

So, for the following data....

 

col1; col2; col3

aa; bb; 1

cd; hj; 3

df; gh; t

 

....col1 is a String, col2 is a String and col3 is meant to be an Integer. However, ALL data is read in as Strings (the safest data type for files). We then connect the input to a tSchemaComplianceCheck component that expects col1 to be a String, col2 to be a String and col3 to be an Integer. If we connect the tSchemaComplianceCheck to two tLogRows (as you have done), we would see the following result....

 

tLogRow1

aa|bb|1

cd|hj|3

 

tLogRow2

df|gh|t|2|newColumn2:wrong type
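The same routing can be imitated in plain Java, as a rough sketch of the idea rather than of how the component is implemented. Every column is kept as a String, and a row goes to the main or the reject list depending on whether col3 parses as an Integer. The class name, the method names and the reject message text are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch only: imitates the idea of a schema check on all-String rows.
// Class, method names and the reject message format are hypothetical.
class ComplianceCheckSketch {

    // True when the text can be parsed as an Integer.
    static boolean isInteger(String value) {
        try {
            Integer.parseInt(value.trim());
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    // Returns two lists: index 0 holds accepted rows, index 1 holds rejected rows.
    static List<List<String>> check(List<String> lines) {
        List<String> accepted = new ArrayList<>();
        List<String> rejected = new ArrayList<>();
        for (String line : lines) {
            String[] fields = line.split(";");
            for (int i = 0; i < fields.length; i++) {
                fields[i] = fields[i].trim();
            }
            String row = String.join("|", fields);
            if (fields.length != 3) {
                rejected.add(row + "|wrong column count");
            } else if (!isInteger(fields[2])) {
                rejected.add(row + "|col3:wrong type");
            } else {
                accepted.add(row);
            }
        }
        return Arrays.asList(accepted, rejected);
    }

    public static void main(String[] args) {
        List<List<String>> result =
                check(Arrays.asList("aa; bb; 1", "cd; hj; 3", "df; gh; t"));
        System.out.println("main flow:   " + result.get(0));
        System.out.println("reject flow: " + result.get(1));
    }
}
```

Because nothing is converted before the check, the bad row survives long enough to be reported instead of failing in the reader.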

 

 

 

 

 


Alpha549
Creator II
Author

Thank you rhall, I understand now and everything is OK!

In this case, right after the tSchemaComplianceCheck it's necessary to use a tConvertType to convert everything from String to the real types we want.
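As a rough illustration of that last step (plain Java, not what tConvertType generates), converting a row that has already passed the check is a straightforward parse, because the check has ruled out values such as `t`. The `TypedRow` class and `convert` method are hypothetical names.

```java
// Sketch only: conversion step for rows that already passed the schema check.
// TypedRow and convert are hypothetical names, not part of any Talend API.
class ConvertAfterCheckSketch {

    // Typed version of the row: col1 and col2 stay Strings, col3 becomes an int.
    static final class TypedRow {
        final String col1;
        final String col2;
        final int col3;

        TypedRow(String col1, String col2, int col3) {
            this.col1 = col1;
            this.col2 = col2;
            this.col3 = col3;
        }
    }

    // Assumes the row has three fields and that the third one was validated
    // as an Integer upstream; otherwise parseInt would throw.
    static TypedRow convert(String[] validated) {
        return new TypedRow(validated[0], validated[1],
                Integer.parseInt(validated[2].trim()));
    }

    public static void main(String[] args) {
        TypedRow row = convert(new String[] {"aa", "bb", "1"});
        System.out.println(row.col1 + "|" + row.col2 + "|" + row.col3);
    }
}
```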