Hi,
I'm trying to create a new table with 2 columns.
That table is the result of joining 11 tables (15 columns and 20,000,000 rows per table).
The read speed started out excellent (around 40,000 rows/sec), but it then slowed down considerably; for the last 2 hours it has been running at 4,000 rows/sec.
How can I increase that speed? Attached below are the entire process and its current speed.
Thank you.
Hello,
What's the size of your RAM? Which Talend product are you using?
Generally speaking, the following aspects can affect job performance:
1. The volume of data: reading a large data set will degrade performance.
2. The structure of the data: if there are many columns on tDBRow, it will consume a lot of memory and time to transfer the data during job execution.
3. The database connection: a job usually runs better when the database is installed locally; if the database is on another machine, even over a VPN, you may run into congestion and latency issues.
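Point 2 above is really about keeping only a small window of rows in memory at a time. As a generic illustration (not Talend's own API — the row source here is simulated), processing rows one at a time from an iterator keeps the footprint flat, whereas collecting the whole result set into a list grows with the row count:

```java
import java.util.Iterator;
import java.util.stream.IntStream;

public class StreamRows {
    // Simulated row source: yields one row at a time instead of
    // materializing the whole result set in memory. A real JDBC
    // ResultSet with a modest fetch size behaves similarly.
    static Iterator<int[]> rowSource(int rows, int cols) {
        return IntStream.range(0, rows)
                .mapToObj(i -> new int[cols]) // one 15-column row per step
                .iterator();
    }

    public static void main(String[] args) {
        long count = 0;
        Iterator<int[]> it = rowSource(1_000_000, 15);
        while (it.hasNext()) {
            int[] row = it.next(); // row becomes garbage right after this iteration
            count++;
        }
        System.out.println("Processed rows: " + count);
    }
}
```

The same principle is why wide rows (many columns) hurt: each in-flight row costs more memory, so fewer rows fit in the heap before the garbage collector starts working overtime.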
Best regards
Sabrina
Hi xdshi,
Thanks for your answer.
I'm using Talend 7.3.1.20200219_1130.
RAM is 8GB, in a server with a 6-core processor.
The memory configuration in Talend is: Xms2049M, Xmx8192M.
The 11 tables have around 80,000,000 rows each, and each table is between 20GB and 40GB.
The problem I have is that the run gives me 2 different error messages:
and
Thanks for your help!!
Hello,
Does this performance issue occur for one job or for all jobs?
If all jobs:
Could you add your workspace to the antivirus exclusion list?
Could you disable the drive indexing done by the operating system?
If only this job:
What is the allocated heap size (the Xmx value in the ini file)?
Where is the database located: on a local drive or a remote one?
Can you check whether performance improves if you enable parallelization on the job (right-click / enable parallelization)?
Hi tsesdl,
Thanks for your answer! I've tried with only one table and the problem also occurs.
The system my Talend is running on is:
Checking the JVM memory assigned, from a command prompt:
java -XshowSettings:vm
VM settings:
Max. Heap Size (Estimated): 1.78G
Ergonomics Machine Class: client
Using VM: Java HotSpot(TM) 64-Bit Server VM
In Talend, before running, I'm always changing the memory in the "Run" tab to:
-Xms256M
-Xmx8016M
However, the file TOS_DI-win-x86_64.ini contains:
-vmargs
-Xms512m
-Xmx1536m
-Dfile.encoding=UTF-8
-Dosgi.requiredJavaVersion=1.8
-XX:+UseG1GC
-XX:+UseStringDeduplication
-XX:MaxMetaspaceSize=512m
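For reference: the values in TOS_DI-win-x86_64.ini set the heap for the Studio (Eclipse) process itself, while the values in the "Run" tab apply to the separate JVM that executes the job. If the Studio itself feels sluggish, its ini heap can be raised; a sketch with an illustrative value (it must still leave room for the job JVM and the OS within the machine's 8GB of RAM):

```ini
-vmargs
-Xms512m
-Xmx2048m
-Dfile.encoding=UTF-8
-Dosgi.requiredJavaVersion=1.8
-XX:+UseG1GC
-XX:+UseStringDeduplication
-XX:MaxMetaspaceSize=512m
```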
Running the job with parallelization, I get this error message:
"Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded"
Thanks for your help!
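One way to confirm which -Xmx is actually in effect for the job JVM (as opposed to the Studio's own ini settings) is to print the heap from inside the running job, for example from a tJava component; a minimal sketch (the class wrapper is only needed to run it standalone):

```java
public class HeapCheck {
    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // maxMemory() reports the -Xmx actually in effect for this JVM,
        // so printing it from inside a job shows which setting won out.
        long maxMb = rt.maxMemory() / (1024 * 1024);
        System.out.println("Effective max heap: " + maxMb + " MB");
    }
}
```

If the reported job heap plus the Studio's heap exceeds physical RAM (8GB here), the OS will start paging and "GC overhead limit exceeded" errors become likely; with parallelization, each thread also needs its own buffer space, so a per-job Xmx that comfortably fits in RAM often behaves better than the largest value that will start.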