Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 
mjtaft2017
Partner - Creator
Partner - Creator

Qlik Sense limit left join of SQL table to only keys in resident table

I want to join some additional fields to a resident table and the additional fields are coming from a DB2 database.  Is it possible to somehow limit the sql database query to only the keys in the resident table?  It seems like Qlik Sense is querying the whole database table before it is trying to do the join -- I am not sure about the order of operations in Qlik Sense compared to SQL

Vehicles:
LOAD
Key,
BuiltDate,
SoldDate
FROM 'lib://SharedQVD/Built_Sold.qvd' (qvd);

LEFT JOIN(Vehicles)

LOAD
C_MKT as Market,
C_FAM as Family,
Key,
I_MOD_YR as [Model Yr],
C_ZONE_SOLD as [Sold Zone];

SQL Select 
Key,
C_MKT,
C_FAM,
I_MOD_YR,
C_ZONE_SOLD
FROM Warehouse.ORDR with ur;

 

 

Labels (3)
1 Solution

Accepted Solutions
marcus_sommer

You need to apply the restriction on the SQL side because filtering the data on the Qlik side means that the whole table will be pulled at first.

If there are not too much KEY values you could try it with a where in() clause or maybe changing the condition from single values to ranges of values, for example KEY >= condition or maybe Date between condition1 and condition2 (even if you couldn't specify an exact condition it might be an useful workaround to reduce the amount of records).

The first mentioned where in might be create with something like:

Vehicles:
LOAD
Key,
BuiltDate,
SoldDate
FROM 'lib://SharedQVD/Built_Sold.qvd' (qvd);

Keys: load concat(fieldvalue('Key', recno(), ',') as Keys autogenerate fieldvaluecount('Key');
let vKeys = peek('Keys', 0', 'Keys');

LEFT JOIN(Vehicles)

LOAD
C_MKT as Market,
C_FAM as Family,
Key,
I_MOD_YR as [Model Yr],
C_ZONE_SOLD as [Sold Zone];

SQL Select 
Key,
C_MKT,
C_FAM,
I_MOD_YR,
C_ZONE_SOLD
FROM Warehouse.ORDR where Key in($(vKeys)) with ur;

Depending on the kind of values and/or the syntax-rules of your database you might need some adjustments to this logic to add quotes or something similar.

If there are really many KEY values - you just talked about records and not about the number of KEY values - the in() logic won't work or might be too slow. I assume that there are limitations and you couldn't use a million values within the in().

If in() and/or the above mentioned value-ranges aren't applicable you could store the KEY list as a table within your database or maybe as a csv-file which is imported into the database and which is then used for an inner join within the SQL statement.

- Marcus

View solution in original post

7 Replies
Channa
Specialist III
Specialist III

add where clause to restrict or do inner join  

Channa
pradosh_thakur
Master II
Master II

You can query whatever the column you need and don't need to write the extra column , even if you do you can change the naming and dont use them in the preceding load statement on top of SQL select.

You can use the qlik default associative property instead of join if it is not necessary.

Learning never stops.
mjtaft2017
Partner - Creator
Partner - Creator
Author

Thank you Pradosh - I do need to join these as I want all of these fields in 1 table.

My question is due to there being several billion records in the database table and I only want to target about 1 million to grab these additional dims.

I have not seen where you would put criteria to limit joining a database table to a resident load.  The qvd is already a product of other joined database tables so I cannot easily repeat all of the "where" conditions that went into creating that qvd.  Does that make sense?

If I was to put a condition in - is it possible to put a "where exists(Key)"  either before the semicolon on the LOAD statement or before the "with ur" on the SQL statement?  It seems that Qlik Sense ought to be looking for the keys anyway - but as I said previously - I am aware of how SQL performs the logical order of operations but not how Qlik Sense does this.  What the breakdown of the Left join process is when the first table is resident and the 2nd table is external database

mjtaft2017
Partner - Creator
Partner - Creator
Author

Thank you for your reply Channa - 

Where would you put that "where" clause 

Here??

LOAD
C_MKT as Market,
C_FAM as Family,
Key,
I_MOD_YR as [Model Yr],
C_ZONE_SOLD as [Sold Zone]
Where exists(Key);                                      << Here?? 

OR 

SQL Select
Key,
C_MKT,
C_FAM,
I_MOD_YR,
C_ZONE_SOLD
FROM Warehouse.ORDR
Where exists(Key)with ur;      << Here ??

Not sure how QlikSense performs operation steps of the join under the hood

pradosh_thakur
Master II
Master II

Hi

 

See if you don't use a where condition Qliksense will query all the column mentioned and and rows  You need to put a where condition in the SQL statement so that it puts a condition there .

 

If i were you i would put a condition in the sql statement first and run that code snippet and check the time taken to load and then use the where in the load statement and check the time taken to load . I would choose the one which is optimized and taking less time. As you said you have billion of rows i am not sure which one would be faster, You can check that and let us know a well.

Learning never stops.
marcus_sommer

You need to apply the restriction on the SQL side because filtering the data on the Qlik side means that the whole table will be pulled at first.

If there are not too much KEY values you could try it with a where in() clause or maybe changing the condition from single values to ranges of values, for example KEY >= condition or maybe Date between condition1 and condition2 (even if you couldn't specify an exact condition it might be an useful workaround to reduce the amount of records).

The first mentioned where in might be create with something like:

Vehicles:
LOAD
Key,
BuiltDate,
SoldDate
FROM 'lib://SharedQVD/Built_Sold.qvd' (qvd);

Keys: load concat(fieldvalue('Key', recno(), ',') as Keys autogenerate fieldvaluecount('Key');
let vKeys = peek('Keys', 0', 'Keys');

LEFT JOIN(Vehicles)

LOAD
C_MKT as Market,
C_FAM as Family,
Key,
I_MOD_YR as [Model Yr],
C_ZONE_SOLD as [Sold Zone];

SQL Select 
Key,
C_MKT,
C_FAM,
I_MOD_YR,
C_ZONE_SOLD
FROM Warehouse.ORDR where Key in($(vKeys)) with ur;

Depending on the kind of values and/or the syntax-rules of your database you might need some adjustments to this logic to add quotes or something similar.

If there are really many KEY values - you just talked about records and not about the number of KEY values - the in() logic won't work or might be too slow. I assume that there are limitations and you couldn't use a million values within the in().

If in() and/or the above mentioned value-ranges aren't applicable you could store the KEY list as a table within your database or maybe as a csv-file which is imported into the database and which is then used for an inner join within the SQL statement.

- Marcus

mjtaft2017
Partner - Creator
Partner - Creator
Author

Thank you for your reply Marcus - 

That might just work if I load incrementally by month.  I am going to see if I can work that out.  I will reach out if I run into a question with your proposal.  

Great idea!