I asked this last week and my post was viewed and ignored, so maybe I worded it oddly, or maybe this is physically impossible with Talend and no one wants to tell me. I have a very simple job: tFileInputDelimited_1 (CSV file of data) --> tMap_1 (maps the data) --> tDBOutput_1 (writes the data to a database). All of that works perfectly, but I'm trying to log errors.
Yes, I have tFlowMeterCatcher, tLogCatcher and tStatCatcher all up and working. The problem is that tFlowMeterCatcher only runs if the job finishes successfully, making it useless for tracking errors. It is the only one of these loggers that tracks a row count, which is really what I need, but it doesn't run when the job fails.
tStatCatcher lets me know which components failed, yay.
tLogCatcher lets me know there is an error, yay, and it tells me the component where the error occurred, normally the CSV input, yay. It doesn't tell me how many lines of the CSV file were processed before failing, boo!
I can't use the global variable ((Integer)globalMap.get("tFileInputDelimited_1_NB_LINE")) because, until the component completes, this global variable is nothing but a useless null. Great for statistics on successful jobs, absolutely useless for error analysis.
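Just to show exactly what I mean (a minimal sketch of what I tried in a tJava component; tFileInputDelimited_1 is the component name from my job):

```java
// tJava run after the read: NB_LINE is only populated once
// tFileInputDelimited_1 completes, so after a crash this is null.
Integer processed = (Integer) globalMap.get("tFileInputDelimited_1_NB_LINE");
System.out.println("rows read: " + processed); // prints "rows read: null" on failure
```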
In the Talend Studio GUI there is a visual representation of "*99999* rows in *9.99*s" and "*999999.99* rows/s" that appears in green for completed components and blue for incomplete ones. So somewhere behind the scenes Talend clearly knows how many rows have been processed and stores that number somewhere; does the user just not have any access to that information? If there is absolutely no way whatsoever to get the number of rows processed before failure into a variable and print it to a log file, then please let me know, and could someone at Talend start looking into making this a thing? This kind of error logging should not be a new concept in the slightest, and the fact that I can't find anything about it on other forums made me think I had just missed something simple. But if that were the case, someone out there would have been able to answer my last post in three words: "This is impossible."
The biggest problem you are encountering here is that you do not fully understand the flow of Talend jobs, and you are missing a few tools that will help you deal with your issues. The most significant tool I can offer you first is the tPostJob component. Since you are dealing with errors, I am assuming your jobs are simply ending with no clean-up or post-job processing. The tPostJob component runs at the end of EVERY execution, so long as power is still being supplied to the machine running your job. You can therefore handle counts and clean-up using components linked to it. In fact, it is a best practice to link logging start and end processes (maybe captured in child jobs with contexts supplying them logging values) to the tPreJob and tPostJob components.
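To make that concrete, here is a minimal sketch of what a tJava linked to tPostJob could do. Everything named here is my own example, not a Talend built-in: "my_rows_read" is a counter you would maintain yourself (see the next sketch), and the log path is invented:

```java
// tJava attached to tPostJob: this still runs when the main subjob dies.
// "my_rows_read" is a counter maintained in the main flow (see below).
Integer rows = (Integer) globalMap.get("my_rows_read");
String stamp = new java.text.SimpleDateFormat("yyyy-MM-dd HH:mm:ss")
        .format(new java.util.Date());
java.io.PrintWriter log = new java.io.PrintWriter(
        new java.io.FileWriter("C:/logs/job_run.log", true)); // append mode
log.println(stamp + " rows processed before the job ended: "
        + (rows == null ? 0 : rows));
log.close();
```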
Your next issue seems to be knowing where (at what row) your job crashed. This will take a few trial-and-error iterations to identify where best to take the metric from. You will deal with logging it after the tPostJob component as mentioned above, but getting the value might be possible using the globalMap values linked to components (depending on the position of the component in the flow), or you may want to build your own count using something like a tMap or tJavaRow and your own globalMap value. As I said, this is very much up to you and will likely require a bit of trial and error to get it the way you want it.
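For example, a tJavaRow dropped into the main flow could maintain that count on every row. This is a sketch only: "my_rows_read" is our own variable, and the id and name columns stand in for your real schema:

```java
// tJavaRow in the main flow: bump our own counter for every row that
// leaves the file input, so the value survives even if a later row crashes.
Integer count = (Integer) globalMap.get("my_rows_read");
globalMap.put("my_rows_read", count == null ? 1 : count + 1);

// tJavaRow needs the columns copied across explicitly.
output_row.id = input_row.id;     // example column
output_row.name = input_row.name; // example column
```

The tPostJob sketch above then simply reads "my_rows_read" back out, whether the job finished or died.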
It should also be pointed out that your job should be built not to fail. Obviously this will happen from time to time, nobody is perfect, but you should be building to catch any and all possible issues with your datasets. If you can capture all possible errors in the flow, then you can easily use the built-in logging. The problem you are having is that the built-in logging is usually tied to the completion of a component and/or a subjob; if either of these fails in a way that causes the job to die, you will find the logging is not brilliant. However, if you wrote any other application that had to deal with data and it crashed when data it wasn't expecting was fed to it, your application would not be getting great reviews. A DI job is no different from any other application: if it crashes, there is a bug that needs to be fixed.
As I said, you can mitigate this by applying your own logging metrics if you really cannot build a job to handle every type of possible error, but having to do that is not an indication that Talend is failing to provide a basic piece of functionality. That "basic" functionality only looks missing because it presumes the job is allowed to crash.
Finally, you mention that your errors appear to occur mainly while reading the CSV file. This is odd if the CSV is well formed and your tFileInputDelimited component is configured correctly. Since you are getting the job to run and produce meaningful results, I am going to assume your tFileInputDelimited config is probably correct, but your CSV file may need some pre-processing to check that its format is suitable. What sort of error messages do you get when your job crashes here?
The errors mainly occur during the reading of the CSV file because that is where I am purposefully making the data bad, to make sure the program can catch it. The CSV file won't be coming from internal sources, so I can't guarantee it will always be clean, hence my focus on making a program that can track when and where something went wrong, so a notification can be sent out and someone can manually fix the file and rerun it. I could easily make it skip over a bad line to prevent a crash, but then it could be dropping data from that file for months without anyone realizing it, which would be a very critical mistake.
You are right, I do not fully understand the flow of Talend jobs, which is why I posted this question on the forums here. I will gladly look into the tPostJob component; looking through other similar logging questions on forums, I have not once seen it used for any logging, so I had not thought to use it. I would love to build the job "not to fail", as you say, but then I would need to log which rows were skipped so I can have someone go back to check the data and see what is wrong. I don't want any job running for months and saying all is well when it hasn't actually done what I want of it.
A trick to help you out in this scenario is to import every column of your CSV file as String. Every one of them. This will allow you to get every row into your job without a tFileInputDelimited component error. After this, you can use a tSchemaComplianceCheck component, which will let you trap any issues with the schema before converting the columns to the correct data types. Once you are at that point, you can carry out any business-logic checks. By doing it this way, you are building the job not to crash, but you can still capture any schema-compliance issues (I assume these are what caused your CSV read to fail), and you will have ALL of the details related to the error rows.
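As a rough illustration of the conversion step (a sketch only; the amount and quantity columns are invented for the example), once rows have passed the tSchemaComplianceCheck the parsing is plain Java in a tJavaRow or tMap expressions, and the reject link from the check carries an errorMessage column you can send straight to your error log:

```java
// tJavaRow after tSchemaComplianceCheck: every column arrived as a String
// and has already passed the declared checks, so parsing here is safe.
output_row.name = input_row.name;                           // stays a String
output_row.amount = Double.parseDouble(input_row.amount);   // String -> Double
output_row.quantity = Integer.parseInt(input_row.quantity); // String -> Integer
```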