TC1692777606
Contributor

Dimension Loading for multiple Files

I have a Talend Job that loads data from multiple files, and schema validation is crucial. Files whose schema does not match the reference schema (emp1_csv) are not handled correctly: when several files have a non-matching schema, only one of them is moved to the "Rejected Folder". The others are not processed at all and remain in the source folder. The goal is for every file to be processed according to its schema compatibility, so that all non-matching files end up in the "Rejected Folder". Screenshots of the Job design are attached below.

2 Replies
anselmopeixoto
Partner - Creator III

Hello @Trupti C​ 

 

The cause of this behavior is that you enabled both "Check each row structure against schema" and "Die on error" options on tFileInputDelimited.

 

The "Die on error" option raises an exception and stops the Job as soon as the first row that doesn't match the schema is found.

 

However, if you disable the "Die on error" option, only the non-matching rows of each file will be rejected. I understand this is not what you expect. Instead, you need to reject the whole file, is that correct?

 

If that's the case, I would try the following:

 

  1. Read the input file at each iteration with the "Die on error" option disabled.
  2. Send the "good" input data to a tHashOutput and the rejected rows to a tJavaRow.
  3. Use an If trigger from tFileInputDelimited or from tJavaRow to identify whether there were rejected rows.
  4. If rejected rows > 0, move the file to the "Rejected Folder" and clear the temporary data stored by tHashOutput (just use a tHashInput connected to a dummy component like tJavaRow to read the stored data and clear it).
  5. If rejected rows == 0, read the data stored in memory using tHashInput and send it to tDBSCD.
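
To make the branching in the steps above easier to follow, here is a minimal plain-Java sketch of the same logic outside Talend. It is not generated Talend code. The class name, the method names, the comma separator, and the sample file names are all assumptions made for illustration, and the "row structure" check is approximated by a simple column-count comparison. The in-memory list stands in for tHashOutput/tHashInput, and returning the rows stands in for sending them to tDBSCD.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class WholeFileReject {

    /** Rows buffered for one file, plus the number of rows that failed the structure check. */
    static final class FileResult {
        final List<String[]> goodRows = new ArrayList<>();
        int rejectedRows = 0;
    }

    /** Steps 1 and 2: read the whole file without stopping on bad rows, buffering good rows. */
    static FileResult readFile(Path file, int expectedColumns, String separator) throws IOException {
        FileResult result = new FileResult();
        List<String> lines = Files.readAllLines(file);
        // Line 0 is assumed to be a header; every other line is checked by column count.
        for (int i = 1; i < lines.size(); i++) {
            String[] fields = lines.get(i).split(Pattern.quote(separator), -1);
            if (fields.length == expectedColumns) {
                result.goodRows.add(fields);   // plays the role of tHashOutput
            } else {
                result.rejectedRows++;         // plays the role of the reject flow counter
            }
        }
        return result;
    }

    /** Steps 3 to 5: reject the whole file if any row failed, otherwise hand over the rows. */
    static List<String[]> processFile(Path file, Path rejectedDir, int expectedColumns) throws IOException {
        FileResult result = readFile(file, expectedColumns, ",");
        if (result.rejectedRows > 0) {
            Files.createDirectories(rejectedDir);
            Files.move(file, rejectedDir.resolve(file.getFileName()), StandardCopyOption.REPLACE_EXISTING);
            result.goodRows.clear();           // discard the buffered rows of the rejected file
        }
        return result.goodRows;                // rows that would go on to the dimension load
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical sample data: one file with three columns, one with only two.
        Path source = Files.createTempDirectory("source");
        Path rejected = source.resolve("rejected");
        Path good = Files.write(source.resolve("emp1.csv"), Arrays.asList("id,name,dept", "1,Ann,HR", "2,Bob,IT"));
        Path bad = Files.write(source.resolve("emp2.csv"), Arrays.asList("id,name", "1,Ann", "2,Bob"));
        for (Path file : Arrays.asList(good, bad)) {
            List<String[]> rows = processFile(file, rejected, 3);
            System.out.println(file.getFileName() + ": " + rows.size() + " rows loaded");
        }
    }
}
```

The point of the sketch is that the decision is made once per file, after all of its rows have been read, so one bad file neither stops the loop nor lets its partial data through to the load.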
TC1692777606
Contributor
Author

Thank you. Since I am a beginner in Talend, I don't have a clear picture of how to do this. For example, I can't connect a Row > Main link from tFileInputDelimited to tHashOutput. Could you please send a sketch or rough picture of the Job?