Hi,
I have a problem: I need to count the columns of a CSV file because, when I check it with the tSchemaCompliance component, I get no error.
I found the origin of the problem: some "\t" characters were inserted in some fields before the data was exported out of the DB. Now I must re-import the data (after an update made by another company).
tSchemaCompliance doesn't check whether some columns are missing in some rows. Is there an option to check that?
I tried with a Metadata file and with a built-in one: same result, no error, because column A ends up in column B's place and A and B have the same type (long), or because the two columns are empty (or one of the two is empty).
How can I find the rows where column B doesn't exist (or where I get 35 columns instead of 36)?
The problem can also be the opposite: I get too many columns because of the \t inserted in some fields, and I can't check whether people add a \t to a field (by copy/paste from another web application, for example): I just receive the CSV file.
Sorry for my bad English,
Thanks for your help,
The tFileInputDelimited component has an Advanced setting "Check each row structure against schema" that you can combine with a rejects flow to identify records with extra or missing fields.
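If you want to locate the bad rows before the file even reaches Talend, the same idea can be sketched in a few lines of Python. This is only a rough sketch, not what Talend does internally: the function name find_bad_rows and the sample rows are made up for illustration, and it assumes a plain tab-delimited file with no quoted fields.

```python
def find_bad_rows(lines, expected, delimiter="\t"):
    """Return (line_number, field_count, reason) for every row whose
    field count differs from the expected number of columns."""
    bad = []
    for line_no, line in enumerate(lines, start=1):
        # Strip only the line ending, so trailing empty fields still count.
        count = len(line.rstrip("\r\n").split(delimiter))
        if count < expected:
            bad.append((line_no, count, "Column(s) missing"))
        elif count > expected:
            bad.append((line_no, count, "Too many columns"))
    return bad


# Hypothetical sample with 3 expected columns; a real file would be read
# with open(path, encoding=...) and expected=36.
sample = ["1\tAnn\tParis\n", "2\tBob\n", "3\tCy\tRome\tIT\n"]
for row in find_bad_rows(sample, expected=3):
    print(row)
```

A stray tab inside a field shows up as "Too many columns", and a dropped field as "Column(s) missing", so both cases from the question are covered by one pass over the file.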
Thanks, it seems OK. A little bit hard, because 60000 rows go wrong and 3 are OK 😉 after creating the schema with the wizard... I will investigate.
Hi Alevy
I am new to Talend and I have faced a similar issue to yours. I have a table with the columns name,addr1,addr2,company and am passing data in a CSV as
name,addr1,addr2,company
q3,sdkjh,ad2
q5,sdkjh,ad2,c1,c2,c5
q8,sdkjh,ad2,c1
and the third row is inserted correctly after I checked "Check each row structure against schema". But when I connected the reject flow to a tFileOutputDelimited, I get
q3,sdkjh,ad2,,,Column(s) missing - Line: 0
q5,sdkjh,ad2,c1,,Too many columns - Line: 1
It cuts off the extra fields in the rows that are too long and puts blank values in place of the missing ones ... I want the rejected data file to look like
q3,sdkjh,ad2Column(s) missing - Line: 0
q5,sdkjh,ad2,c1,c2,c5Too many columns - Line: 1
Is there any way to get the output in the above form? If not with "Check each row structure against schema", is there another approach that lets me filter the rows with correct data into one CSV and the wrong ones into another CSV?
Thanking you
Nishanth
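One way to get that kind of reject file is to do the split on the raw lines yourself, so nothing is truncated or padded. Below is a minimal Python sketch using the three data rows from this post. The function name split_rows is hypothetical, the 0-based "Line:" numbering simply mirrors the reject output shown above, and the message is appended after a delimiter (change the two format strings if you want it glued directly to the data). It assumes no quoted fields containing commas.

```python
def split_rows(lines, expected, delimiter=","):
    """Split raw lines into (good, rejects) by field count.
    Rejected lines keep their original text, followed by a reason."""
    good, rejects = [], []
    for index, line in enumerate(lines):
        raw = line.rstrip("\r\n")  # keep the row exactly as received
        count = len(raw.split(delimiter))
        if count == expected:
            good.append(raw)
        elif count < expected:
            rejects.append(f"{raw}{delimiter}Column(s) missing - Line: {index}")
        else:
            rejects.append(f"{raw}{delimiter}Too many columns - Line: {index}")
    return good, rejects


# Data rows from the post (header excluded); the schema has 4 columns.
rows = ["q3,sdkjh,ad2", "q5,sdkjh,ad2,c1,c2,c5", "q8,sdkjh,ad2,c1"]
good, rejects = split_rows(rows, expected=4)
print(good)
print(rejects)
```

Each list can then be written to its own file, one line per entry, which gives one CSV with the correct rows and another with the untouched rejected rows.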
Hi,
I have a similar query: I need to check how many columns my CSV file has and raise a reject when the count is wrong. Checking "Check each row structure against schema" in tFileInputDelimited is not working for me: the row does go to the reject file, but without a proper error message such as "Column(s) missing".
Is there any other solution for this?
Thanks