Announcements
See what Drew Clarke has to say about the Qlik Talend Cloud launch! READ THE BLOG
cancel
Showing results for
Did you mean:
Contributor III

## Which is more efficient - INNER JOIN or WHERE

A question was brought up during our SCRUM call this morning, and no one could give an answer. Lets say you have two tables:

Approved Products:

 Product_Number 123abc 456def

 Product_Number Product_Name 123abc Shoe 456def Shirt 789ghi Pants 101112jkl Belt

And you want the resulting table to be the All Products table, but only with the Approved Products in it.

Result:

 Product_Number Product_Name 123abc Shoe 456def Shirt

The two solutions we were debating between are:

or

Solution #2: First create a comma separated list of the values from Approved Products, vList. Then use that value in the where clause - i.e. LOAD * RESIDENT All Products WHERE Product_Number MATCH(Product_Number, \$(vList))

Both of these solutions will get me the same result, but I am curious as to which of these are the optimal solution. For some reason, joining out to a table seems like it would require more resources, but that is literally just a feeling I have. Huge bonus points if you even know why one is more optimal than the other.

Kind regards,

Gage

Labels (3)

• ### General Question

In my opinion, #2 would generally be the wrong way to go about it. The "correct" equivalent for #2 would be to load Approved Products first and then load  All Products using  "Where Exists(Product_Number)".

TempProducts:
From Approved_Products ...;

All_Products:
From All_Products.qvd (qvd)
Where Exists (Product_Number);

Drop Table TempProducts;

This is especially effective if All_Products is loaded from a QVD as Where Exists() will be an optimized load.  That is the most resource efficient solution because where exists is fast and Qlik will use the least memory by filtering as it builds the table.

If you must filter All_Products from a resident table, using Inner Join is the best approach.

Inner Join (All_Products)