Anonymous
Not applicable

Issue Clarification on TOS vs (TIS / TDQ)

Good day,
I am writing to understand more about the Talend products mentioned above. I ask mainly because the webinars I've attended did not use the open source edition; they used TIS / TDQ. Can TOS achieve the same transfer rates in both heterogeneous and homogeneous environments?
We experienced slow transfer rates using JasperETL Pro for large datasets, i.e. more than 2 million records. See below:

Heterogeneous environment benchmark testing (Oracle to MySQL)
Total Number of Rows: 1095934
Number of Rows per second: 617.03
AVG time in minutes: 29.6

Total Number of Rows: 2025100
Number of Rows per second: 159.12
AVG time in minutes: 212.11 (3.53 hrs)

Homogeneous environment benchmark testing (MySQL to MySQL)
Total Number of Rows: 1095934
Number of Rows per second: 4345.24
AVG time in minutes: 4.20

Total Number of Rows: 2020000
Number of Rows per second: 595.88
AVG time in minutes: 56.49
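
As a quick consistency check on the figures above, the elapsed time should be roughly the row count divided by the rows-per-second rate. A minimal Java sketch of that arithmetic follows; the class and method names are illustrative only and are not part of any Talend or JasperETL API.

```java
public class ThroughputCheck {
    /** Elapsed minutes implied by a row count and an average rows-per-second rate. */
    static double elapsedMinutes(long rows, double rowsPerSecond) {
        return rows / rowsPerSecond / 60.0;
    }

    public static void main(String[] args) {
        // Row counts and rates copied from the four benchmark runs quoted above.
        long[] rows = {1095934L, 2025100L, 1095934L, 2020000L};
        double[] rates = {617.03, 159.12, 4345.24, 595.88};
        for (int i = 0; i < rows.length; i++) {
            System.out.printf("%d rows at %.2f rows/s: %.2f min%n",
                    rows[i], rates[i], elapsedMinutes(rows[i], rates[i]));
        }
    }
}
```

The computed minutes agree with the reported averages to within rounding, so the reported rates and times are at least internally consistent.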

Does Talend have benchmark test results to share with us? I am afraid we may have missed one or more settings, resulting in poor performance.
I seek clarification on this to clear our doubts. Furthermore, would it be possible for us to get an evaluation version of TIS & TDQ?
Your input to point us in the right direction would be highly appreciated.

Regards,
Yoke Yew
18 Replies
Anonymous
Not applicable
Author

Hi Shong,
As I replied earlier, our situation involves 2 RDBMSs, which means the source and target are always databases (either the same or different RDBMS). Can you show me any component that can transfer datasets from one DB to another faster?
Thanks & regards,
Yoke Yew
Anonymous
Not applicable
Author

Hello
In your case, you just need to replace tMysqlOutput with tMysqlOutputBulkExec, e.g.:
tOracleInput---tMysqlOutputBulkExec.
Best regards
shong
Anonymous
Not applicable
Author

Hi Shong,
We tried that option earlier, but the job fails to execute and returns the following error:
Starting job mysqlbulk at 15:44 23/11/2009.
connecting to socket on port 4115
Exception in thread "main" java.lang.Error: Unresolved compilation problems:
The constructor File() is undefined
Syntax error on token ";", delete this token
at etl_performance_poc.mysqlbulk_0_1.mysqlbulk.tOracleInput_2Process(mysqlbulk.java:1872)
at etl_performance_poc.mysqlbulk_0_1.mysqlbulk.runJobInTOS(mysqlbulk.java:3441)
at etl_performance_poc.mysqlbulk_0_1.mysqlbulk.main(mysqlbulk.java:3350)
connected
Job mysqlbulk ended at 15:44 23/11/2009.

The FILE_NAME field is shown as mandatory.
Thanks & regards,
Yoke Yew
Anonymous
Not applicable
Author

Hello
The FILE_NAME field is shown as mandatory.

There is a compilation error in the generated code; you must specify the file path.
Best regards
shong
Anonymous
Not applicable
Author

Hi Shong,
As I indicated earlier, no file is involved. The job extracts datasets from one database and loads them into another database.
Regards,
Yoke Yew
Anonymous
Not applicable
Author

As I indicated earlier, no file is involved. The job extracts datasets from one database and loads them into another database.

The file is an intermediate output file: the component first writes all records to the file and then bulk-inserts them into the target DB from that file.
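
The two steps described above can be sketched in plain Java. This is only an illustration of the general stage-then-bulk-load idea, not the code that tMysqlOutputBulkExec generates: the class name, method names, the `;` delimiter, and `target_table` are all assumptions, and values containing the delimiter are not escaped. The statement built in step 2 uses MySQL's `LOAD DATA LOCAL INFILE` syntax.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class BulkStagingSketch {
    /** Step 1: write every source row to an intermediate delimited file. */
    static void writeStagingFile(Path file, List<String[]> rows) throws IOException {
        try (BufferedWriter out = Files.newBufferedWriter(file, StandardCharsets.UTF_8)) {
            for (String[] row : rows) {
                out.write(String.join(";", row)); // simplification: no escaping of ';'
                out.write('\n');
            }
        }
    }

    /** Step 2: build the MySQL statement that bulk-loads the file into the target table. */
    static String loadStatement(Path file, String table) {
        // Forward slashes avoid backslash-escaping issues inside the SQL string literal.
        String path = file.toAbsolutePath().toString().replace('\\', '/');
        return "LOAD DATA LOCAL INFILE '" + path + "' INTO TABLE " + table
                + " FIELDS TERMINATED BY ';' LINES TERMINATED BY '\\n'";
    }

    public static void main(String[] args) throws IOException {
        Path staging = Files.createTempFile("bulk_staging", ".csv");
        List<String[]> rows = new ArrayList<>();
        rows.add(new String[] {"1", "alice"});
        rows.add(new String[] {"2", "bob"});
        writeStagingFile(staging, rows);
        // In a real job this statement would be executed over a JDBC connection to MySQL.
        System.out.println(loadStatement(staging, "target_table"));
    }
}
```

This is why the component asks for a file path even though both the source and the target are databases: the file only exists as a staging area between the read and the bulk load.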
Best regards
shong
Anonymous
Not applicable
Author

Hi Shong, Cedric,
Thanks for the hint provided. Yes, we have successfully redone our benchmark testing:
- 2 million records transferred from the Oracle DB server to the MySQL DB server
- 1 Gbps LAN
The run time has dropped from the earlier 2 hr 30 min to 20 min.
I have one question: would TDQ/TIS give a better transfer rate than TOS?
Regards,
Yoke Yew
Anonymous
Not applicable
Author

Hi Shong, Cedric, all,
Would TDQ/TIS give a better transfer rate than TOS?
Regards,
Yoke Yew
Anonymous
Not applicable
Author

Hello Yoke Yew
Yes, most of the features we add are dedicated to the commercial subscription products. Some of them are:
*Grid Computing* :
This is grid computing at the *project* level (the project is split across several execution servers; it is not grid computing inside a job). For each job, the JobConductor selects the best available execution server on which to deploy and run the job. It optimizes the scalability and availability of the integration processes by ensuring optimal use of the execution grid, automatically distributing Jobs across the execution servers grouped in a virtual server.
*tParallelize* :
tParallelize helps you parallelize and synchronize the execution of numerous subjobs in your main job. (see the third screenshot)
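
To illustrate the general idea of running subjobs in parallel and then synchronizing on their completion, here is a minimal plain-Java sketch. It is not how tParallelize is implemented; the class name, method name, and subjob names are hypothetical, and each subjob is modeled as a simple `Runnable`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentLinkedQueue;

public class ParallelSubjobsSketch {
    /** Starts every subjob concurrently and returns only once all of them have finished. */
    static void runAllAndWait(List<Runnable> subjobs) {
        List<CompletableFuture<Void>> running = new ArrayList<>();
        for (Runnable subjob : subjobs) {
            running.add(CompletableFuture.runAsync(subjob));
        }
        // Synchronization point: block until every subjob has completed.
        CompletableFuture.allOf(running.toArray(new CompletableFuture<?>[0])).join();
    }

    public static void main(String[] args) {
        Queue<String> finished = new ConcurrentLinkedQueue<>();
        List<Runnable> subjobs = new ArrayList<>();
        // Hypothetical subjob names, for illustration only.
        for (String name : new String[] {"load_customers", "load_orders", "load_items"}) {
            subjobs.add(() -> finished.add(name));
        }
        runAllAndWait(subjobs);
        System.out.println("Subjobs finished: " + finished.size());
    }
}
```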

*ParallelizingDataFlows*:
Parallel processing of data speeds up the execution of a job by dividing the data flow into multiple fragments that can execute simultaneously. Data processed across N fragments might execute N times faster than it would if processed as a single fragment. (see the second screenshot)
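
The fragment idea can be sketched in plain Java as follows. This is an assumption-laden illustration, not Talend's implementation: the class and method names are made up, the "processing" is just a sum, and the N-times figure is a best case that real jobs may not reach (for example when the database or the network is the bottleneck).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class FragmentedFlowSketch {
    /** Splits rows into at most n contiguous fragments of near-equal size (n must be > 0). */
    static <T> List<List<T>> split(List<T> rows, int n) {
        List<List<T>> fragments = new ArrayList<>();
        int size = (rows.size() + n - 1) / n; // ceiling division
        for (int start = 0; start < rows.size(); start += size) {
            fragments.add(rows.subList(start, Math.min(start + size, rows.size())));
        }
        return fragments;
    }

    /** Processes each fragment on its own thread and combines the per-fragment results. */
    static long sumInParallel(List<Integer> rows, int n) {
        ExecutorService pool = Executors.newFixedThreadPool(n);
        try {
            List<Future<Long>> partials = new ArrayList<>();
            for (List<Integer> fragment : split(rows, n)) {
                partials.add(pool.submit(() -> {
                    long sum = 0;
                    for (int value : fragment) {
                        sum += value;
                    }
                    return sum;
                }));
            }
            long total = 0;
            for (Future<Long> partial : partials) {
                total += partial.get(); // waits for that fragment to finish
            }
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<Integer> rows = new ArrayList<>();
        for (int i = 1; i <= 100; i++) {
            rows.add(i);
        }
        System.out.println(sumInParallel(rows, 4)); // prints 5050
    }
}
```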

*FileScale* :
It's a new big project (started more than 1 year ago) to allow high parallelization on "big" computers (with a lot of CPUs). See more information at http://www.talend.com/products-data-integration/talend-integration-suite-mpx.php
*SOA Manager*:
*Run the same job several times* :
In the SOA Manager, you can set parameters that fork the JVM for each call (very useful in an EAI architecture when you have 10 calls per second).
More information at http://www.talend.com/products-data-integration/talend-integration-suite-rtx.php
Best regards
shong