Performance Issue

Anonymous — Tue, 01 Mar 2016 16:23:08 GMT

I have an issue with the design below. It consumes a huge amount of resources. How can the performance be improved? Kindly help me.

>>I could remove tmap3 and tmap4, but I kept them to avoid carrying unwanted columns into the buffer. I have also used some filter conditions in those tMaps.
I have 8 columns: A, B, C, D, E, F, G, H. I filter the records on C, D, E, F, G (tmap3 & tmap4) and take only A, B, H into the reference buffer (tmap1 & tmap2). Is this the right approach? Please correct me if I'm wrong.
>>tFileInputDelimited_2 & tFileInputDelimited_3 read the same file. If I extract it through a single component (one tFileDelimited), I cannot use it as a reference in two places. Is there any approach to handle this? Extracting the file only once would improve performance.
>>The job is very resource-intensive. I'm getting 3 million records from the source and 2.5 million from each reference. I allocated Xmx16384, but I cannot allocate this much RAM to a single job. I need help with this.
>>Sorting and removing duplicates take a long time. I used the sort on disk option, but it is still very resource-intensive. Are there other ways to do this more efficiently?
Someone, please help me out.
Thanks
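
To make the idea in the post concrete (filter on the extra columns first, keep only A, B, H for the lookup, and read the shared file a single time), here is a minimal plain-Java sketch. It is not Talend-generated code and not a confirmed fix for this job. The class name LookupSketch, the method loadOnce, the column order A..H, the use of A as the join key, and the two filter predicates are all assumptions made for illustration.

```java
import java.io.BufferedReader;
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Predicate;
import java.util.regex.Pattern;

public class LookupSketch {

    // Assumed positions of columns A, B, C and H in the 8-column file.
    static final int A = 0, B = 1, C = 2, H = 7;

    /**
     * Reads the delimited input once and fills two lookup maps.
     * Each map is keyed by column A (assumed join key) and stores only
     * columns B and H, so the filter columns C..G are never buffered.
     */
    static void loadOnce(BufferedReader in, String delimiter,
                         Predicate<String[]> filter1, Predicate<String[]> filter2,
                         Map<String, String[]> lookup1, Map<String, String[]> lookup2) {
        Pattern splitter = Pattern.compile(Pattern.quote(delimiter));
        in.lines().forEach(line -> {
            String[] cols = splitter.split(line, -1);
            if (cols.length < 8) {
                return; // skip malformed rows
            }
            // Slim row shared by both maps; treat it as read-only.
            String[] slim = { cols[B], cols[H] };
            if (filter1.test(cols)) {
                lookup1.put(cols[A], slim);
            }
            if (filter2.test(cols)) {
                lookup2.put(cols[A], slim);
            }
        });
    }

    public static void main(String[] args) {
        // Tiny in-memory stand-in for the reference file.
        String data = "k1;b1;x;d;e;f;g;h1\n"
                    + "k2;b2;y;d;e;f;g;h2\n"
                    + "k3;b3;x;d;e;f;g;h3\n";
        Map<String, String[]> lookup1 = new HashMap<>();
        Map<String, String[]> lookup2 = new HashMap<>();
        // Hypothetical filter conditions standing in for tmap3 and tmap4.
        loadOnce(new BufferedReader(new StringReader(data)), ";",
                 row -> row[C].equals("x"),
                 row -> row[C].equals("y"),
                 lookup1, lookup2);
        System.out.println(lookup1.size() + " " + lookup2.size());
    }
}
```

The point of the sketch is that each row is parsed once, tested against both filters, and stored in slim form, so memory grows with the number of rows that pass a filter times two columns rather than with the full eight-column file read twice. Whether the same layout is achievable in the actual job depends on the components available in the Studio version in use.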

Re: Performance Issue

Anonymous — Wed, 02 Mar 2016 03:49:53 GMT

Hi,
We have replied to your other topic: https://community.talend.com/t5/Design-and-Development/Performance-issue-with-below-design/td-p/89491.
Could you please take a look at it?
Best regards
Sabrina
