Do not input private or sensitive data. View Qlik Privacy & Cookie Policy.
Skip to main content

Announcements
Save $650 on Qlik Connect, Dec 1 - 7, our lowest price of the year. Register with code CYBERWEEK: Register
cancel
Showing results for 
Search instead for 
Did you mean: 
Anonymous
Not applicable

HELP. join with table with many records

Hi

I have to read 2 tables in join and update a third one. Simple !!

But when talend performs a join on tMap, it first reads all records in the second table and then runs. this table can have hundreds of thousands of records. Is it possible not to read all the records first? Can you help me?

 

Best Regards

Benny

Labels (2)
1 Solution

Accepted Solutions
Anonymous
Not applicable
Author

There are a couple of ways to approach this problem.

 

If the tables are from the same database, you should join in one query. Always filter your data at the database where possible. It is incredibly inefficient to load all of your data into Talend to filter it if you do not need to.

 

If the data is from different databases then it is important to filter out as much data as you can in your initial queries. Since you are comparing one row with potentially thousands (hundreds of thousands?), removing even a handful of rows from your Main or Lookup will help with performance. You also need to think about whether it is more efficient to bring all of your Lookup data in at once (at the beginning), or fire a Lookup query for every Main row that arrives? If you want to fire a query for each Main row that arrives, you can filter the query using data from the Main row. This will massively reduce the number of rows returned BUT comes at the expense of potentially hundreds of thousands of queries. To do this you need to select "Reload at each row" from the "Lookup Model" dropdown in the Lookup settings within the tMap. I have a tutorial which shows how this is used (although for a very different problem) here: https://www.rilhia.com/quicktips/quick-tip-row-multiplication

 

View solution in original post

2 Replies
Anonymous
Not applicable
Author

There are a couple of ways to approach this problem.

 

If the tables are from the same database, you should join in one query. Always filter your data at the database where possible. It is incredibly inefficient to load all of your data into Talend to filter it if you do not need to.

 

If the data is from different databases then it is important to filter out as much data as you can in your initial queries. Since you are comparing one row with potentially thousands (hundreds of thousands?), removing even a handful of rows from your Main or Lookup will help with performance. You also need to think about whether it is more efficient to bring all of your Lookup data in at once (at the beginning), or fire a Lookup query for every Main row that arrives? If you want to fire a query for each Main row that arrives, you can filter the query using data from the Main row. This will massively reduce the number of rows returned BUT comes at the expense of potentially hundreds of thousands of queries. To do this you need to select "Reload at each row" from the "Lookup Model" dropdown in the Lookup settings within the tMap. I have a tutorial which shows how this is used (although for a very different problem) here: https://www.rilhia.com/quicktips/quick-tip-row-multiplication

 

Anonymous
Not applicable
Author

I'm just a fool. The table are in the same database and I can work with SQL, view, index, etc.

 

Thank you for your suggestions

 

Benny