I am processing about 300k records for deduplication using the tMatchGroup component. I enabled the "Store on disk" option to improve performance, but I am not seeing any improvement. I also have a question about the max buffer size value: is there a limit on it, and when data is stored on disk, is it stored column-wise or row-wise? Please suggest how I can improve the job's performance.
The buffer can grow as large as your system can support (i.e., the available RAM). However, for large datasets it may be better to store the data on the file system and read it back as needed. You will have to experiment to find the right mix of the two.
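
To make the trade-off between memory and disk concrete, here is a minimal Java sketch of a buffer that keeps rows in memory up to a fixed size and then spills them to a temporary file. This is only an illustration, not how tMatchGroup is implemented: the class `SpillingBuffer`, its methods, and the choice to write one row per line are all assumptions made for the example, and it assumes rows contain no line breaks.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; this is not Talend's implementation.
public class SpillingBuffer {
    private final int maxBufferSize;                 // rows kept in memory before spilling
    private final List<String> buffer = new ArrayList<>();
    private final Path spillFile;                    // temporary file holding spilled rows
    private int spilledRows = 0;

    public SpillingBuffer(int maxBufferSize) throws IOException {
        this.maxBufferSize = maxBufferSize;
        this.spillFile = Files.createTempFile("spill", ".txt");
        this.spillFile.toFile().deleteOnExit();
    }

    // Add a row; once the in-memory buffer is full, append it to the file.
    public void add(String row) throws IOException {
        buffer.add(row);
        if (buffer.size() >= maxBufferSize) {
            flush();
        }
    }

    // Append all buffered rows to the spill file (one row per line) and clear memory.
    private void flush() throws IOException {
        try (BufferedWriter writer = Files.newBufferedWriter(
                spillFile, StandardCharsets.UTF_8, StandardOpenOption.APPEND)) {
            for (String row : buffer) {
                writer.write(row);
                writer.newLine();
            }
        }
        spilledRows += buffer.size();
        buffer.clear();
    }

    public int spilledRows() {
        return spilledRows;
    }

    public int bufferedRows() {
        return buffer.size();
    }

    // Read back the spilled rows first, then the rows still in memory.
    public List<String> readAll() throws IOException {
        List<String> all = new ArrayList<>(
                Files.readAllLines(spillFile, StandardCharsets.UTF_8));
        all.addAll(buffer);
        return all;
    }
}
```

In a sketch like this, a small buffer size means frequent disk writes (slower, but low memory use), while a large one keeps more rows in RAM. That is the kind of balance you would have to tune by experiment for your own data and machine.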