Hi,
I would like to know how Talend Open Studio handles huge amounts of data.
For instance, if I want to select a 20 GB MySQL table, do some transformations on the fields, and put the data in another database, how does Talend Open Studio do it?
In my example, I have a tMysqlInput --> tMap --> tMysqlOutput
I tried the following two things:
-> Limit the JVM memory usage with the option -Xmx6144M (6 GB of memory)
-> In the tMap component, I specified the "temp data directory" on disk, so I guess Talend writes data there to free some memory.
By doing this, my job seems to work. If I don't do this, the job crashes, returning a memory exception.
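In case it is useful: this is also how I pass the option when I run the exported job from the command line (the jar, package, and class names below are just placeholders for my generated job):

    java -Xmx6144M -cp "lib/*:myjob_0_1.jar" myproject.myjob_0_1.myjob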
But what is Talend doing exactly?
Does it just store the data in a temporary file and read from / write to this file?
Does it use a specific algorithm to store the data temporarily?
Can it still crash if there is too much data?
Is the Talend Platform for Big Data a lot more optimized than the Open Studio version?
I know it's a lot of questions, but if someone has a few answers, it would really help me.
Best regards,
Bertrand.
Hi Bertrand,
I would look at the job differently. As per the previous post, you can write the data from the source to a flat file in chunks, load each file into the target database, remove the old file, and loop again (see the sketch below). This also gives you some control over job execution and makes it easier to recover in case of failure.
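To make the idea concrete, here is a minimal plain-JDBC sketch of that chunking loop (this is not the code Talend generates); it assumes the source table has a numeric primary key called id, and the connection details, table, and column names are placeholders:

    import java.io.FileWriter;
    import java.io.PrintWriter;
    import java.sql.*;

    public class ChunkedExtract {
        public static void main(String[] args) throws Exception {
            // Placeholder connection; in a Talend job this is what the input component wraps.
            Connection conn = DriverManager.getConnection(
                    "jdbc:mysql://localhost:3306/sourcedb", "user", "password");

            final int chunkSize = 100000;   // rows per flat file
            long lastId = 0;                // keyset pagination: also a restart point after a failure
            int chunk = 0;

            while (true) {
                PreparedStatement ps = conn.prepareStatement(
                        "SELECT id, col1, col2 FROM big_table WHERE id > ? ORDER BY id LIMIT ?");
                ps.setLong(1, lastId);
                ps.setInt(2, chunkSize);
                ResultSet rs = ps.executeQuery();

                int rows = 0;
                try (PrintWriter out = new PrintWriter(new FileWriter("chunk_" + chunk + ".csv"))) {
                    while (rs.next()) {
                        lastId = rs.getLong("id");
                        out.println(lastId + ";" + rs.getString("col1") + ";" + rs.getString("col2"));
                        rows++;
                    }
                }
                rs.close();
                ps.close();

                if (rows == 0) break;   // nothing left to extract
                // Here the flat file would be bulk-loaded into the target database
                // (e.g. a bulk-load step), then removed before the next iteration.
                chunk++;
            }
            conn.close();
        }
    }

Keyset pagination (WHERE id > ? ORDER BY id) is used instead of OFFSET so each chunk stays fast even deep into the table, and keeping the last id gives you a place to resume from if the job fails halfway.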
Thanks
Vaibhav
Hi,
I have a MySQL source table with a huge amount of data, more than 15,000,000 rows. I am using the tMysqlInput component but am unable to extract data from the table; the connection fails every time. Is this a heap size issue with the extract? Please help.
I am simply using:
tMysqlInput --> tMap --> tPostgresqlOutput
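From what I have read, the MySQL driver buffers the whole result set in memory by default, and tMysqlInput has an "Enable stream" option to avoid that. My understanding is that in plain JDBC this corresponds to something like the sketch below (connection details and table name are placeholders) -- is that the right direction?

    import java.sql.*;

    public class StreamingRead {
        public static void main(String[] args) throws Exception {
            // Placeholder connection details.
            Connection conn = DriverManager.getConnection(
                    "jdbc:mysql://localhost:3306/sourcedb", "user", "password");

            // Forward-only, read-only statement with fetch size Integer.MIN_VALUE:
            // this is how MySQL Connector/J is told to stream rows one by one
            // instead of loading the full result set into memory.
            Statement stmt = conn.createStatement(
                    ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY);
            stmt.setFetchSize(Integer.MIN_VALUE);

            ResultSet rs = stmt.executeQuery("SELECT id, col1 FROM big_table");
            while (rs.next()) {
                // Each row is processed as it arrives; memory use stays flat.
                System.out.println(rs.getLong("id") + ";" + rs.getString("col1"));
            }
            rs.close();
            stmt.close();
            conn.close();
        }
    }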
Thanks,
Naveen
Hi,
I am facing the same issue. I have 6 billion records to load and the input is SQL Server. I did not find the "Enable stream" option in the tMSSqlInput component, and when I increased the JVM size to -Xmx1024m there was no increase in loading speed. What should be done to increase the data loading speed in Talend?
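My understanding, which I am hoping someone can confirm, is that on the plain JDBC side the SQL Server driver can be asked to use a server-side cursor and a fetch size so it does not buffer the whole result set. Here is a rough sketch of what I mean (host, database, credentials, and table name are placeholders):

    import java.sql.*;

    public class SqlServerStreamingRead {
        public static void main(String[] args) throws Exception {
            // selectMethod=cursor asks the Microsoft JDBC driver to use a server-side
            // cursor, so rows are fetched in batches instead of buffered all at once.
            Connection conn = DriverManager.getConnection(
                    "jdbc:sqlserver://localhost:1433;databaseName=sourcedb;selectMethod=cursor",
                    "user", "password");

            Statement stmt = conn.createStatement(
                    ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY);
            stmt.setFetchSize(10000);   // rows pulled per round trip

            ResultSet rs = stmt.executeQuery("SELECT id, col1 FROM big_table");
            while (rs.next()) {
                // Rows are consumed as they arrive rather than held in the heap.
                System.out.println(rs.getLong("id") + ";" + rs.getString("col1"));
            }
            rs.close();
            stmt.close();
            conn.close();
        }
    }

Is there a way to get the same behaviour from the tMSSqlInput component?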