vidya821
Creator

Performance improvement - lookup with tMap

Dear All,

 

I am using tMap to look up two different databases, and I am narrowing the lookup values with an expression; however, I am only getting a throughput of 7-8 rows per second.

I want to process around 1 Million records.

Attached is the design. Is there anything more that can be done to improve performance?

Note: All indexes are in place.

 

Thanks

Vidya

1 Solution

Accepted Solutions
Anonymous
Not applicable

OK, your problem is the reload at each row. I suspect that your query being fired is looking through a lot of data and you are firing it a million times. That is guaranteed to be slow. From your diagram it looks like the main source of data and the lookup query are from the same database. If that is the case, do the lookup in the main query. There is absolutely no point joining in Talend if your data starts off in the same database. If it is not in the same database it might make sense to add the lookup data to your main data's database somehow.

 

You will not get round this with simple tweaks I'm afraid. 1 million queries is a lot of queries. You have to deal with the latency of building, sending and receiving the data for every single row in your main source.  
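The difference between the two patterns can be sketched in plain Python with sqlite3 standing in for the real databases (all table and column names here are hypothetical, purely for illustration): one lookup query fired per main row versus a single query where the database does the join.

```python
import sqlite3

# In-memory stand-in for a database holding both the main and lookup data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, cust_id INTEGER)")
conn.execute("CREATE TABLE customers (cust_id INTEGER, name TEXT)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", [(10, 1), (11, 2)])
conn.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Ada"), (2, "Bob")])

# Slow pattern: one lookup query per main row ("reload at each row").
# With a million main rows this means a million round trips.
slow = []
for order_id, cust_id in conn.execute("SELECT order_id, cust_id FROM orders"):
    name = conn.execute(
        "SELECT name FROM customers WHERE cust_id = ?", (cust_id,)
    ).fetchone()[0]
    slow.append((order_id, name))

# Fast pattern: one query, join pushed down to the database engine.
fast = list(conn.execute(
    "SELECT o.order_id, c.name FROM orders o "
    "JOIN customers c ON c.cust_id = o.cust_id ORDER BY o.order_id"))

print(slow == fast)  # True - same result, one query instead of N
```

Both patterns produce identical rows; the per-row version just pays query latency once per record, which is exactly where the 7-8 rows/s comes from.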


10 Replies
Anonymous
Not applicable

It looks like your Main row is quite slow. Can you test this by removing the other components and running with just a tLogRow? Also, can you show us your DB component configuration, both Basic and Advanced?

Anonymous
Not applicable

There are quite a few things that can cause a job like this to be slow. You might try creating a test job with just the database connection and a tLogRow (no tMap) and see if it is significantly faster. If it isn't, then tMap isn't the issue.

 

If tMap is likely the issue, try rewriting your select query so you don't need to use an expression filter. You can include the value from globalMap in the query statement; that way, the database's query engine is doing the work rather than tMap (which is necessarily going to be slower, because it processes one row at a time, similar to a cursor).
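The pushdown idea can be sketched in Python with sqlite3 as a stand-in for the lookup database (table, column, and function names below are made up for illustration): instead of filtering lookup rows in a tMap expression, the per-row key is bound into the lookup query, so the narrowing happens inside the database engine.

```python
import sqlite3

# In-memory stand-in for the lookup database (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE lookup_tbl (cust_id INTEGER, region TEXT)")
conn.executemany("INSERT INTO lookup_tbl VALUES (?, ?)",
                 [(1, "EMEA"), (2, "APAC"), (3, "EMEA")])

def lookup_region(cust_id):
    # The filter runs inside the DB engine - analogous to embedding the
    # globalMap value in the component's query string instead of
    # filtering all lookup rows in a tMap expression.
    row = conn.execute(
        "SELECT region FROM lookup_tbl WHERE cust_id = ?", (cust_id,)
    ).fetchone()
    return row[0] if row else None

print(lookup_region(2))  # APAC
```

The database scans (or index-seeks) only the rows matching the key, rather than streaming every lookup row through the job for client-side filtering.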

 

Hope this helps.

vidya821
Creator
Author

Hi Rhall,

 

Attached is the job with just the tLogRow and DB connection, along with the Basic and Advanced settings.

 

Thanks

 


Performace issue1.png
vidya821
Creator
Author

The DB connection with tLogRow is quite a bit faster.
I cannot use globalMap in the DB query because the job takes input from one DB and uses it in another DB to limit the rows for the lookup; the filter constraint for the second DB changes with respect to each row from the first DB.
I need to use tMap in this case.
Anonymous
Not applicable

Go back to your original job and switch on the "Use Cursor" tick box. I think you will see an improvement.

vidya821
Creator
Author

Hi, ticking "Use Cursor" had no impact on performance; it's still the same.

 

Do you recommend any cursor size? I tried the range 100-10000.

Anonymous
Not applicable

How is your tMap configured? Can you show us a screenshot of this configuration please?

vidya821
Creator
Author

Here is the tMap config and DB query.

With a cursor size of 100, the performance improved slightly, from 7 rows/s to 11 rows/s.

Can it be improved more?


Performace issue.png