Anonymous
Not applicable

Postgres data reading stream performance & memory usage

We are using the Big Data Enterprise edition. Please help me with the job scenario below.
PostgresInput ------------------------------------> File
All of the ETL transformation is written as a SQL query and called inside the Postgres input component; the SQL output is written to a file in the same job.
The job is triggered from the ETL server, and the Postgres DB is on a different server.
The following questions come to mind.
a) I ran the job with a data size of around 4 GB. The job fails if I allocate less than 4 GB to the JVM, so Talend seems to keep all of the data (4 GB) in memory (RAM). It looks like the pipeline doesn't work in this case, or Postgres streaming doesn't work?
Once some bytes of data have been read, shouldn't the component write them to the file and release the buffer?
I'm pushing all of the load to the DB, which is on a different server; the ETL server is used only for I/O. Why does it occupy so much RAM on the ETL server? Am I missing anything? Please suggest.
Thanks
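For context, this matches the PostgreSQL JDBC driver's default behaviour: with autocommit on, or no fetch size set, the driver materialises the whole result set in the JVM before the first row is handed back, so nothing is streamed and the heap has to hold all ~4 GB. A minimal plain-JDBC sketch of that default path, with placeholder connection details and table name:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class DefaultFetchDemo {
    public static void main(String[] args) throws Exception {
        // Placeholder connection details for illustration only.
        Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://dbhost:5432/etl", "etl_user", "secret");

        // Default mode: autocommit is on and no fetch size is set, so the
        // PostgreSQL driver pulls the ENTIRE result set into the JVM before
        // executeQuery() returns -- no rows are streamed to the consumer.
        try (Statement st = conn.createStatement();
             ResultSet rs = st.executeQuery("SELECT * FROM big_table")) {
            while (rs.next()) {
                // At this point every row is already sitting in the JVM heap.
            }
        }
        conn.close();
    }
}
```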
  
4 Replies
vapukov
Master II

On the Advanced settings tab, tick the "Use Cursor" checkbox,
then adjust the number of rows to fetch per batch.
The final performance is affected by two "variables":

number of rows
size of the data
You can play with both - reduce the number of rows for wider data (BLOB, TEXT columns) or increase it for narrower rows.
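In plain JDBC terms, "Use Cursor" corresponds to running the query inside a transaction with a positive fetch size, so the driver reads rows from a server-side cursor in batches instead of all at once. A rough sketch of that pattern with placeholder connection details, table and file path (not the code Talend actually generates):

```java
import java.io.PrintWriter;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class CursorFetchDemo {
    public static void main(String[] args) throws Exception {
        // Placeholder connection details for illustration only.
        Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://dbhost:5432/etl", "etl_user", "secret");

        // Cursor-based streaming requires autocommit off and a positive
        // fetch size; otherwise the PostgreSQL driver falls back to loading
        // the full result set into memory.
        conn.setAutoCommit(false);

        try (Statement st = conn.createStatement();
             PrintWriter out = new PrintWriter("/tmp/big_table.csv")) {
            st.setFetchSize(100_000); // the "number of rows" fetched per round trip
            try (ResultSet rs = st.executeQuery(
                    "SELECT id, payload FROM big_table")) {
                while (rs.next()) {
                    // Only the current batch is held in the JVM: write each
                    // row out so the buffer can be reused for the next batch.
                    out.println(rs.getLong(1) + ";" + rs.getString(2));
                }
            }
            conn.commit();
        }
        conn.close();
    }
}
```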
Anonymous
Not applicable
Author

Thanks.
I found the cursor option.
We are using EDB Postgres (EnterpriseDB) instead of the open-source version. What is the difference between the tPostgresql and tPostgresqlPlus components?
vapukov
Master II

I don't know - I only use the open-source version of Postgres, so I'm not familiar with the EDB features.
Anonymous
Not applicable
Author

Thanks. I set the cursor to 100000 and found a huge improvement in performance.
With a 1 GB JVM, the job completed successfully without any memory issues.
In one case, reading 1 million records from Postgres, I found that only 0.2 million were read, yet the job completed successfully. What could be the issue?
My first thought was that AUTOCOMMIT was set to TRUE, but it was enabled in the code.
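One thing worth checking: cursor-based fetching in the PostgreSQL JDBC driver only works with autocommit off, and the server-side cursor is only valid until the transaction that opened it ends, so anything that commits the connection mid-read could cut the stream short. I'm not certain this explains the short read, but a diagnostic sketch along these lines (placeholder connection details and table) counts the rows actually streamed and compares them with the server-side count:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class RowCountCheck {
    public static void main(String[] args) throws Exception {
        // Placeholder connection details for illustration only.
        try (Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://dbhost:5432/etl", "etl_user", "secret")) {

            // Cursor fetching only works inside a transaction (autocommit off);
            // the cursor is closed when that transaction ends.
            conn.setAutoCommit(false);

            long streamed = 0;
            try (Statement st = conn.createStatement()) {
                st.setFetchSize(100_000);
                try (ResultSet rs = st.executeQuery("SELECT id FROM big_table")) {
                    while (rs.next()) {
                        streamed++;
                    }
                }
            }
            conn.commit();

            // Compare against the server-side count to detect a short read.
            try (Statement st = conn.createStatement();
                 ResultSet rs = st.executeQuery("SELECT count(*) FROM big_table")) {
                rs.next();
                System.out.println("streamed=" + streamed
                        + " expected=" + rs.getLong(1));
            }
        }
    }
}
```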