tPostgresqlInput performance scaling issue

I'm still new to TOS and figuring things out as I go. Here is the next issue I've run into.

I initially had a job with a tPostgresqlInput component that extracted 5 columns from one table, a tMap in which I did some data type conversions and created some date fields, and a PostgreSQL output component that loaded the data back into PostgreSQL. That ran reasonably fast for a source table with fewer than 100,000 records or so. Then I received a source table with 32 million records and performance ground to a halt at around 70 rows/sec, even after I increased the JVM arguments to -Xms2560M and -Xmx7000M on my laptop.
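
For reference, these are the memory settings I'm using (set under Run > Advanced settings > Use specific JVM arguments, if I've got the location right):

-Xms2560M
-Xmx7000M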

I figured the culprit must be the tMap component, in which I was perhaps doing too many conversions, so I split that work out into a dedicated tConvertType. Performance didn't improve either.

Then I created a test job (screenshot attached) and reduced the complexity to just the tPostgresqlInput component with a simple select of 5 columns and a tFileOutputDelimited component writing to a flat file. For the 32 million record dataset, the job takes ages to even start processing and eventually fails with a Java heap space error.
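
To try to rule Talend out, I'm planning to test the raw PostgreSQL JDBC driver with a minimal standalone program like the sketch below (connection details, table and column names are placeholders for my setup). My understanding is that the driver buffers the entire result set in memory unless autocommit is off and a fetch size is set, in which case it streams rows through a server-side cursor; if that streaming is missing in the generated job code it would explain the heap error, but I'm not sure.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class PgFetchTest {
    public static void main(String[] args) throws Exception {
        // Placeholder connection details and table/column names -- adjust to your setup.
        String url = "jdbc:postgresql://localhost:5432/mydb";
        try (Connection conn = DriverManager.getConnection(url, "myuser", "mypassword")) {
            // With autocommit on, the PostgreSQL JDBC driver materialises the whole
            // result set in memory before returning the first row. Turning autocommit
            // off and setting a fetch size makes it stream in batches via a cursor.
            conn.setAutoCommit(false);
            try (Statement stmt = conn.createStatement()) {
                stmt.setFetchSize(10000); // rows fetched per round trip to the server
                long rows = 0;
                long start = System.currentTimeMillis();
                try (ResultSet rs = stmt.executeQuery(
                        "SELECT col1, col2, col3, col4, col5 FROM my_table")) {
                    while (rs.next()) {
                        rows++;
                    }
                }
                long elapsedMs = System.currentTimeMillis() - start;
                System.out.println(rows + " rows read in " + elapsedMs + " ms");
            }
        }
    }
}

If that standalone test streams the 32 million rows without trouble, I'd suspect the job's fetch/cursor settings rather than PostgreSQL itself. (I think tPostgresqlInput also has a "Use cursor" option under Advanced settings, but I haven't tried it yet.)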

I then recreated a similar job in SSIS, and SSIS loaded all 32 million records in 3 minutes 42 seconds, an average of 144,144 rows/sec.

I also searched the forums here and came across another user with similar PostgreSQL performance issues: https://community.talend.com/t5/Design-and-Development/tPostgresqlInput-Query-Slow-Through-Talend-Ye...

In the meantime, I've experimented with different row counts:

1M records: 232,612 rows/sec
10M records: 240,028 rows/sec
20M records: 76,377 rows/sec
30M records: 20,065 rows/sec

Anything over 31M records seems to freeze the job.

Does anyone know what's going on?
