Anonymous
Not applicable

Improve performance when doing a lookup

Hi team,

When I run a subjob, the main table has more than 200,000 records. I used the "Reload at each row" lookup model to filter the lookup table in tMap, and also added conditions on the main table. However, the whole job still takes about 20 minutes.

 


I applied some filters through globalMap variables in the lookup table's query.


 

In tMap, I used a "Left Outer Join" with the "Reload at each row" lookup model.


 

Running the complete job still takes a long time (about 20 minutes).

Do you have any ideas for performance tuning?

BTW, are there any good ways to use a connection pool in Talend?

Thank you in advance!

16 Replies
Anonymous
Not applicable
Author

Only one row is returned for each row the lookup is fired for.

You are right, the best practice is to find some values to filter my lookup query and load it once, which is more efficient.

However, what if I cannot find values to filter my lookup query? Can I use "Parallelization" to execute the job, and then use a connection pool to control the tMSSqlConnection? What do you think? Or do you have any other ideas?
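Outside Talend, the cost difference between "Reload at each row" and a load-once lookup can be sketched in plain Java. The class and query counter below are illustrative stand-ins, not Talend APIs: the point is that a per-row lookup fires one database round-trip per main-flow row, while load-once fires a single bulk query and serves every row from an in-memory hash.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class LookupDemo {
    // Simulated round-trip counter (stands in for queries sent to the DB server)
    static int queriesFired = 0;

    // "Reload at each row": one query per main-flow row
    static String lookupPerRow(String key, Map<String, String> db) {
        queriesFired++;                  // each call hits the database
        return db.get(key);
    }

    public static void main(String[] args) {
        Map<String, String> db = Map.of("A", "1", "B", "2", "C", "3");
        List<String> mainRows = List.of("A", "B", "A", "C", "B", "A");

        // Reload at each row: queries fired == number of main rows
        for (String k : mainRows) lookupPerRow(k, db);
        System.out.println("per-row queries: " + queriesFired);    // 6

        // Load once: a single bulk SELECT, then in-memory hash lookups
        queriesFired = 0;
        queriesFired++;                           // the one bulk query
        Map<String, String> cache = new HashMap<>(db);
        for (String k : mainRows) cache.get(k);   // no further round-trips
        System.out.println("load-once queries: " + queriesFired);  // 1
    }
}
```

With 200,000 main rows, that is 200,000 round-trips versus one, which is why filtering the lookup enough to load it once usually wins.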


Anonymous
Not applicable
Author

Using parallelization may help in this scenario, but it could also slow things down. The rule of thumb for the number of threads is the number of cores minus 1. If you start with this and tweak it (up and down) you will find an optimum. This also depends on how much other work your machine is doing.
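The cores-minus-one rule of thumb can be sketched in plain Java; this is an illustration of the sizing heuristic, not Talend's own parallelization code, and the submitted task is a dummy placeholder.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class PoolSizing {
    // Rule of thumb from above: available cores minus one, never below 1
    static int workerCount() {
        return Math.max(1, Runtime.getRuntime().availableProcessors() - 1);
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(workerCount());
        // Dummy unit of work standing in for one parallel branch of the job
        Future<Integer> f = pool.submit(() -> 21 * 2);
        System.out.println("threads=" + workerCount() + ", result=" + f.get());
        pool.shutdown();
    }
}
```

Leaving one core free keeps the machine responsive for the JVM itself and anything else running alongside the job; treat the value as a starting point to benchmark from, not a fixed answer.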

Connection pooling will not help you here. The connection component will handle connections for every DB component in your job, as long as they are hooked up to the connection component. Connection pooling would only really be of use where the job is running multiple times in parallel.

Anonymous
Not applicable
Author

Yeah, we'd like to run the job multiple times in parallel.

However, it seems connection pooling cannot be used in TBD 6.2.1. Right?

Anonymous
Not applicable
Author

I don't understand how connection pooling would help you. The Connection component maintains the connections for the job. If you have jobs running in parallel (i.e. multiple instances of a job, NOT a section of the job as demonstrated in your screenshots), the Connection component in each instance will handle this. Connection pooling is only really a requirement for a web service where thousands of instances of a service can be using connections concurrently. This is handled quite easily using Spring and has been supported for quite a while now.
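For readers unfamiliar with the concept being debated, here is a minimal sketch of what a connection pool does: expensive resources are opened once and then borrowed and returned, rather than re-opened per use. The `Conn` class is a hypothetical stand-in for a real JDBC connection; a production job would use an established pool library rather than this toy.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical stand-in for a pooled resource (e.g. a JDBC Connection)
class Conn {
    final int id;
    Conn(int id) { this.id = id; }
}

public class TinyPool {
    private final BlockingQueue<Conn> idle;

    TinyPool(int size) {
        idle = new ArrayBlockingQueue<>(size);
        for (int i = 0; i < size; i++) idle.add(new Conn(i)); // pre-opened once
    }

    // Blocks when the pool is exhausted instead of opening a new connection
    Conn borrow() {
        try {
            return idle.take();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    void release(Conn c) { idle.offer(c); }

    public static void main(String[] args) {
        TinyPool pool = new TinyPool(2);
        Conn a = pool.borrow();
        Conn b = pool.borrow();
        pool.release(a);                       // a goes back to the pool
        Conn c = pool.borrow();
        System.out.println(c.id == a.id);      // true: same physical connection reused
    }
}
```

This is exactly the behavior that only pays off when many concurrent consumers compete for connections, which is why the reply says a single Talend job, whose Connection component already holds one connection for all its DB components, gains nothing from it.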

Anonymous
Not applicable
Author

We resolved the performance issue by adding more filters to the lookup table, which was the simplest approach. After I modified it, the job sped up by 50%. It's much better than before.

Many thanks for your help!

Anonymous
Not applicable
Author

Hello Icy,

Thanks for your feedback and posting your solution here.

Best regards

Sabrina

 

 

Anonymous
Not applicable
Author

Hi 

In your lookup, the database server needs to calculate the execution plan each time the query is fired, because of the "Reload at each row" property in tMap.

 

The best approach for a lookup like this is to use a parameterized query, so the database server calculates the execution plan for the query once, not on every execution. To do this, you'll need to replace the LKPFulfilPackageCharge lookup table with a tOracleRow and a tParseRecordSet.

 

In the tOracleRow, don't concatenate filter values into the query; use a parameterized query instead, such as "select columnA, columnB from Table01 where columnC = ?".

 

That will improve your Job's performance a lot.
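The plan-reuse argument above can be simulated in plain Java. The cache below is an illustrative stand-in for the server-side plan cache (it is not a real database API): concatenated filters produce a different SQL string per row, so every row compiles a new plan, while a parameterized query keeps the SQL text constant and reuses one plan.

```java
import java.util.HashMap;
import java.util.Map;

public class PlanCacheDemo {
    static int plansCompiled = 0;
    // Simulated server-side plan cache, keyed by the exact SQL text
    static Map<String, String> planCache = new HashMap<>();

    static void execute(String sql) {
        // Compile a plan only when this exact SQL text has not been seen before
        planCache.computeIfAbsent(sql, s -> { plansCompiled++; return "PLAN"; });
    }

    public static void main(String[] args) {
        String[] values = {"10", "20", "30"};

        // Concatenated filters: distinct SQL per row -> a new plan every time
        for (String v : values)
            execute("select columnA from Table01 where columnC = " + v);
        System.out.println("plans (concatenated): " + plansCompiled);   // 3

        // Parameterized query: the SQL text never changes -> one plan, reused
        plansCompiled = 0;
        planCache.clear();
        for (String v : values)
            execute("select columnA from Table01 where columnC = ?");
        System.out.println("plans (parameterized): " + plansCompiled);  // 1
    }
}
```

In a real job the `?` placeholder is bound per row through a prepared statement (JDBC `PreparedStatement` semantics), which also protects against SQL injection as a side benefit.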

Best regards.