Performance - 12 million rows in tFileInputDelimited
Every hour a new log file is created with about 12 million entries. The ETL job runs every hour and has to read these 12 million rows from the file and load them into a staging table.
I'm facing a huge performance issue.
I'm not able to upload an image of the Talend job, so here is the flow:
1) tFileList
2) tFileInputDelimited
3) tConvertType
4) tMap
5) insert into the table
It takes about 3 hours to process the 12 million rows in a single log file. Because that exceeds the 1-hour window, the log files keep accumulating and the ETL job can never work through the backlog.
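To put numbers on the gap, here is a quick back-of-the-envelope check in Java using only the figures above (12 million rows, about 3 hours now, 1 hour available). The class and method names are made up for illustration. The job currently manages roughly 1,100 rows per second and would need more than about 3,300 rows per second to keep up.

```java
// Back-of-the-envelope throughput check using the figures from the post.
final class ThroughputCheck {
    /** Average rows per second when `rows` are processed in `seconds`. */
    static double rowsPerSecond(long rows, long seconds) {
        return (double) rows / seconds;
    }

    public static void main(String[] args) {
        long rows = 12_000_000L;
        double current = rowsPerSecond(rows, 3 * 3600L); // observed: about 3 hours
        double required = rowsPerSecond(rows, 3600L);    // target: within 1 hour
        System.out.printf("current:  about %.0f rows/s%n", current);
        System.out.printf("required: more than %.0f rows/s%n", required);
    }
}
```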
Does anyone have any suggestions on how to improve this process?
Thanks
Hi
Usually tMap and tConvertType are the bottleneck in a job like this.
First, use the 'store on disk' option of tMap.
Second, delete tConvertType and do the type conversion in tMap expressions instead.
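As a rough sketch of what converting inside tMap could look like: tMap expressions are plain Java, so a String column can be converted right in the output expression. The helper class below is hypothetical (it is not a Talend built-in), and the column name `row1.status` in the comment is only an assumed example. Something like this could be saved as a user routine and called from the expression editor.

```java
import java.math.BigDecimal;

// Hypothetical helper, e.g. saved as a user routine and called from tMap.
// Inline equivalent in a tMap expression (assumed column name row1.status):
//   row1.status == null ? null : Integer.valueOf(row1.status.trim())
final class ConvertInMap {
    /** String -> Integer; returns null for null or blank input. */
    static Integer toInt(String s) {
        if (s == null || s.trim().isEmpty()) {
            return null;
        }
        return Integer.valueOf(s.trim());
    }

    /** String -> BigDecimal; returns null for null or blank input. */
    static BigDecimal toDecimal(String s) {
        if (s == null || s.trim().isEmpty()) {
            return null;
        }
        return new BigDecimal(s.trim());
    }

    public static void main(String[] args) {
        System.out.println(toInt(" 200 "));
        System.out.println(toDecimal("12.50"));
        System.out.println(toInt(""));
    }
}
```

Non-numeric input still throws NumberFormatException here, so malformed log lines would need a reject flow or a try/catch, depending on how strict the load should be.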
Besides, using a bulk component (e.g. tOracleOutputBulkExec) will improve performance.
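For context, as far as I know the bulk components work in two phases: the rows are first written to a flat file, and that file is then handed to the database's own bulk loader instead of being inserted row by row. Below is a minimal plain-Java sketch of the first phase only. It is not Talend's actual implementation, and the class name, method name and delimiter are assumptions for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.List;

// Sketch of phase one of a bulk load: dump rows as delimited lines.
// Phase two (running the database's bulk loader on the file) is not shown.
final class BulkFileWriter {
    /** Appends each row as one delimited line and returns the row count. */
    static long writeRows(Appendable out, Iterable<String[]> rows, char delimiter) {
        long count = 0;
        try {
            for (String[] row : rows) {
                // No quoting or escaping: assumes fields never contain the delimiter.
                out.append(String.join(String.valueOf(delimiter), row)).append('\n');
                count++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }

    public static void main(String[] args) {
        List<String[]> rows = Arrays.asList(
                new String[] {"1", "GET", "200"},
                new String[] {"2", "POST", "500"});
        StringBuilder sink = new StringBuilder();
        long written = writeRows(sink, rows, ';');
        System.out.print(sink);
        System.out.println(written + " rows written");
    }
}
```

In a real job the Appendable would be a BufferedWriter on a file, for example one obtained from Files.newBufferedWriter.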
Regards,
Pedro