Anonymous
Not applicable

Performance on a job

Hi,
Talend looks like a very good solution to me, but I am now running performance tests and my results are disastrous.
I have attached a screenshot of my job that describes my business rules.
My project is to replace a loader currently built in Access.
With my current loader, the execution time of this job is 4 minutes and 30 seconds.
With Talend, the execution time of the same job is 1 hour and 28 minutes.
How can I improve the performance?
Thx.
29 Replies
Anonymous
Not applicable
Author

Maybe I'm missing something - sorry to intrude.
It was mentioned earlier that one of the databases is Oracle, yet I see no tOracle components in the job. How come?
Anonymous
Not applicable
Author

You don't see a tOracleInput because I use the ODBC component to connect to the Oracle data warehouse. On my screenshot, that input component is labelled DWH.
For mhirt: on what type of database do you load at 75,000 rows/s? Oracle? Do you display the statistics to measure the performance, or not?
Thx
Anonymous
Not applicable
Author

Sorry suzchr, I get 75,000 rows/s with file-to-file jobs. My message was for Maverick (he is limited to only 3,000 rows per second and I don't understand why).
For databases, the best performance is obtained with the bulk components (not available for Access).
Otherwise, it is mainly a matter of tweaking the autocommit / commit interval.
In Java you can show the statistics; it doesn't affect performance much.
In Perl, it has more impact.
HTH,
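For illustration, the autocommit / "Commit every" tweaking mentioned above amounts to roughly the plain JDBC pattern below (a minimal sketch with a placeholder ODBC DSN and table, not the code Talend actually generates): autocommit is switched off and a commit is issued every N rows.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class CommitEveryDemo {
    public static void main(String[] args) throws Exception {
        // Placeholder connection: an ODBC DSN pointing at the Access file (adjust to your setup).
        Connection conn = DriverManager.getConnection("jdbc:odbc:MyAccessDsn");
        conn.setAutoCommit(false);          // let us decide when to commit

        int commitEvery = 20000;            // the "Commit every" value to tune
        PreparedStatement ps = conn.prepareStatement(
                "INSERT INTO target_table (id, label) VALUES (?, ?)");

        for (int i = 1; i <= 422188; i++) { // same row count as in the job discussed here
            ps.setInt(1, i);
            ps.setString(2, "row " + i);
            ps.executeUpdate();
            if (i % commitEvery == 0) {
                conn.commit();              // flush the pending rows in one transaction
            }
        }
        conn.commit();                      // commit whatever is left
        ps.close();
        conn.close();
    }
}
```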
Anonymous
Not applicable
Author

I checked again.
I was exaggerating with 2,000 rows/sec :s
It's 7,000 rows/sec when reading from a delimited flat file with the following specs:
- Number of rows: 7,000,000
- Number of columns: 12
- My job writes to an Excel file; if I write to a delimited file instead, the rate grows to 15,000 rows/sec
- HDD speed: 7,200 RPM
- I have an antivirus, but I can't disable it (SBS behind it 🙂)
But never mind, I don't have any problem with this 🙂
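As a side note, a rows-per-second figure like the ones above can be reproduced outside Talend with a small harness such as the sketch below (the file names are placeholders): it copies a delimited file line by line and reports the throughput.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.FileReader;
import java.io.FileWriter;

public class ThroughputCheck {
    public static void main(String[] args) throws Exception {
        long start = System.currentTimeMillis();
        long rows = 0;

        // Placeholder file names: a delimited source copied to a delimited target.
        try (BufferedReader in = new BufferedReader(new FileReader("input.csv"));
             BufferedWriter out = new BufferedWriter(new FileWriter("output.csv"))) {
            String line;
            while ((line = in.readLine()) != null) {
                out.write(line);
                out.newLine();
                rows++;
            }
        }

        double seconds = (System.currentTimeMillis() - start) / 1000.0;
        System.out.printf("%d rows in %.1f s -> %.0f rows/s%n", rows, seconds, rows / seconds);
    }
}
```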
Anonymous
Not applicable
Author

mhirt, I have a question for you! I see that your status is Talend Team. Does that mean you work for the Talend company?
Anonymous
Not applicable
Author

I have another question about my job. To improve performance, I need to adjust the commit setting on tAccessOutput. However, I don't know whether it's better with a big commit (every 20,000 rows, for example) or a small commit (every 10 rows, for example).
I use a computer with 1 GB of RAM and my process writes 422,188 rows into my Access database.
Anonymous
Not applicable
Author

Does somebody know how the commit is handled if I enter 0 as the "Commit every" value?
Anonymous
Not applicable
Author

suzchr,
"I have a question for you! I see that your status is Talend Team. Does that mean you work for the Talend company?"

Yes, I'm working for Talend! 🙂

"However, I don't know whether it's better with a big commit (every 20,000 rows, for example) or a small commit (every 10 rows, for example)."

In general, it's better with a big "Commit every" value, but it's not as simple as that.
You may get better performance with a commit every 40,000 rows than with a commit every 50,000 rows.
You have to run tests to find the best value.

"Does somebody know how the commit is handled if I enter 0 as the 'Commit every' value?"

With 0 or an empty value, there won't be any commit at all.
HTH,
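If it helps, one way to run those tests outside of Talend is a rough benchmarking harness like the sketch below (the connection string, table name and row count are placeholders): it times the same insert loop with several "Commit every" values, and when the value is 0 it only issues a single commit at the very end so the rows still get persisted.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class CommitBenchmark {

    // Insert 'rows' rows, committing every 'commitEvery' rows (0 = only one commit at the end).
    static long timeInsert(String url, int rows, int commitEvery) throws Exception {
        long start = System.currentTimeMillis();
        try (Connection conn = DriverManager.getConnection(url)) {
            conn.setAutoCommit(false);
            try (PreparedStatement ps = conn.prepareStatement(
                    "INSERT INTO bench_table (id, label) VALUES (?, ?)")) {
                for (int i = 1; i <= rows; i++) {
                    ps.setInt(1, i);
                    ps.setString(2, "row " + i);
                    ps.executeUpdate();
                    if (commitEvery > 0 && i % commitEvery == 0) {
                        conn.commit();
                    }
                }
            }
            conn.commit(); // final commit covers the leftover rows (or everything when commitEvery == 0)
        }
        return System.currentTimeMillis() - start;
    }

    public static void main(String[] args) throws Exception {
        String url = "jdbc:odbc:MyAccessDsn";          // placeholder connection string
        int[] candidates = {10, 1000, 20000, 125000, 0};
        // Note: empty bench_table between runs (or drop any primary key on id) to keep runs comparable.
        for (int c : candidates) {
            System.out.println("Commit every " + c + " -> " + timeInsert(url, 100000, c) + " ms");
        }
    }
}
```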
Anonymous
Not applicable
Author

Thank you for all your answers!
I am running benchmarks on my job and I will post my results afterwards.
My first impression is that with Access the most efficient setting is to commit every single row. It's unusual, but in my case that's how it is.
Anonymous
Not applicable
Author

So I am running my benchmarks and the results are not good...
In fact, I ran two kinds of benchmark. The first is the complete job, where the best time is obtained with a commit value of 10 on the tAccessOutput: 22 minutes, versus 9 minutes with my Access loader.
Then I created the same job without the write to Access (I removed the output component). The time is 4 minutes 40 seconds. This is very good.
Then I created a job that only writes to Access: 500,000 rows generated by a tRowGenerator. The best time is obtained with a commit value of 125,000: 5 minutes 39 seconds. This is also efficient.
All in all, I created two jobs: one which only extracts the data and ends with a tBufferOutput component, and another which takes that data and writes it to Access. But the performance is bad: after two hours it had written only 125,000 rows.

How can I improve my performance? Does someone have a good idea?