Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi All,
I'm new to Talend, I was working with Informatica ETL tool before, so please bear with me.
My purpose is to do log metrics (Start Time, End Time, Records loaded, Source Table, Target Table etc) and load to a centralized table in a database.
I want to use Joblet to do this, because I dont want to code the same ETL logging functionality each time I create a new job, I think using Joblet is easier to maintain because then I dont have to go back to each Job if there needs to be a modification.
I tried doing this using Joblet, but I'm facing hurdles as for example if I wanted to capture the records loaded to a target table
I'm using PostJob and tMSSQLrow to execute stored procedure which updates/inserts records to this ETL table.
But my solution does not work, because the table name would change for each job
tMSSQLOUTPUT_1 can be tMSSQLOUTPUT_2 in another job.
Can you please let me know on how to approach?
Thanks
Use the Activity Monitoring Console. Create the 3 tables needed for AMC to log flowmeter, logcatcher, statcatcher metrics. Turn on statistics on each component you want (be warned though that it impacts speed of execution of your job due to the excessive statistics). Monitor the rows using the flow catcher.
When you run the job, everything will be written to AMC tables. Then you can query it there.
Use the Activity Monitoring Console. Create the 3 tables needed for AMC to log flowmeter, logcatcher, statcatcher metrics. Turn on statistics on each component you want (be warned though that it impacts speed of execution of your job due to the excessive statistics). Monitor the rows using the flow catcher.
When you run the job, everything will be written to AMC tables. Then you can query it there.
so, what you are saying is to use a tSetGlobalVar to assign variables to a Global variable and to use them in the Joblet
But it doesn't serve my purpose as I still need to do the coding required to assign variables for each individual job
@RaglandA I'm afraid you will need to set some values in your job in order for them to be used by your joblet. Your table names will need to be supplied for a start. Lets say that you have your logging functionality set up in your joblet (a perfectly acceptable solution), how do you know how many tables you may be reading or writing to? You could only be reading from 1 OR you could be reading and writing to 20 tables.
As an alternative, I would suggest following what @iburtally said. Use the AMC. I have written a few tutorials on this (and am soon to write another with a better way of using it). Here is one of my older tutorials (https://www.rilhia.com/tutorials/talend-activity-monitoring-console-amc). You can see the sort of data you can get in that tutorial (and how to configure it with other tutorials linked to that). There is an issue with the AMC in that it can be quite expensive when it comes to processing time and database usage. You can get round this with a bit of a trick.
1) Configure your AMC tables
2) Create a joblet that looks like the following screenshot.....
What I am doing here is linking the tLogCatcher, tStatCatcher and tFlowMeterCatcher to tFlowToIterate components. I am then using RunIf links to decide whether to load this data to the AMC DB. The condition in the RunIfs are based on boolean context variables. This allows you to dynamically switch on and off the logging without redesigning your job (....a piece of functionality that is not natively supported by Talend).
3) Drop this joblet into your jobs.
The above will log everything the AMC supports.
I also think what @TRF said is sound. You will need to configure your logging for job specific stuff (ie table names, filenames, etc), within your job. YOu can create reusable functionality for this. Maybe some sort of child job or maybe even a joblet. But you will need to pass whatever you use some parameters for it to do what you want. As a hint, I would pay special attention to Talend PIDs (discussed in my tutorials). These will help you link your data for your job specific stuff with your AMC data.
This might seem like a lot of work, but once you have a framework configured it will save you a lot of time moving forwwards
I had a similar requirement, My reusable Joblet is querying a table and i am passing where clause from the job. I use global variable to pass where condition to the joblet.
Hi @rhall
Provide link is now working.
Please provide latest link to tutorial. It will be helpful.
Regards
Hi @Rajender,
I'm afraid that server is no longer active. I had to take it down because it was being attacked and I couldn't keep up with the security maintenance. I will try to get a blog written about the AMC as soon as I have the time. Sorry for the inconvenience.