Qlik Community

Qlik Design Blog

All about product and Qlik solutions: scripting, data modeling, visual design, extensions, best practices, etc.

Employee
Employee

Dual & Exists – Useful Functions

I find that the Dual and Exists functions are useful functions that I often use in my script.  The Dual function allows me to represent a string field numerically, making it ideal for sorting purposes.  The Exists function, used in conjunction with the Where clause, allows me to only load related data into my data model.

The Dual function can be used in the script and chart expressions.  I tend to use it in the script when I need the ability to sort a string field numerically.  For example, in the script snippet below, I created age buckets and I want to be able to sort them in a list box and a chart so I used the Dual function to give each bucket a numeric value.

Dual script.png

By using the Dual function, I am able to sort the AgeBucket list box and the AgeBucket dimension in my chart numerically.

AgeBucket list box.png

Dual chart.png

The Exists function is used to determine whether a “specific field value exists in a specified field of the data loaded so far.”  I often use this with the Where clause to load data only when the specified data has already been loaded in the data model.  For example, in the script below the Age and AgeBucket fields are loaded only if the PersonID has already been loaded in the data model.

Entire script.png

Notice in the AgeTemp table that there are ages listed for PersonID 11 and 12 but since those IDs were not loaded in the data model (in the People table), they are excluded by the Where Exists(PersonID) clause.  The table below shows the data that was loaded.

Person table.png

I often blog about functions used in QlikView but I find the Dual and Exists to be two functions that I use often so that I can present data in a desired sort order and can keep my data model free of unassociated data.  If you would like to learn more or see an example application, check out my technical brief.

Thanks,

Jennell

24 Comments
Marcio_Campestrini
Valued Contributor

Jennell

Great post. Thanks for explain this for us.

Márcio

0 Likes
903 Views
robert99
Valued Contributor II

Its a very hot day and I'm tired so maybe I'm missing something

But doesn't left join do the same as exists.

And hasn't 11 and 12 already been loaded in the data model (under an inline load). Or does inline loads not count with exists.

903 Views
sudeepkm
Valued Contributor III

This is a good post. Thanks a lot.

Would it be OK to consider the below example as another scenario of exists():

We have a function called Match which we can use in the where clause but with hard coded matching values.

for example:

load City, employeeID from employee where Match(City,'Newark','Wilmington','Milford');

Now assume that Client requirement says that the matching City name would change in course of time.

then instead of hard coding it, it can be dynamically read from a spread sheet which will have those City names that clients want to include in the where clause criteria.

so now I can load a table with all those City names to be added to where clause and use Exists function in the where clause like below.

MatchCity:

load CityNames from Cities.xls; // the xls have Newark, Wilmington and Milford

Maintab:

load City, employeeID from employee where exists(CityNames,City);

Drop table MatchCity; // we no longer need the table

Thanks..

0 Likes
903 Views

I often use Exists as when loading from a qvd the load remains optimised

903 Views
sudeepkm
Valued Contributor III

Hi Bill,

while using Exists we will be using it in the where clause and I think when we load from a qvd and use a where clause then the qvd load is no more optimize. Can you please clarify from your end?

0 Likes
903 Views

Yup, a load with a single where exists remains optimised.

A load with any other kind of where is not optimised.

903 Views
robert99
Valued Contributor II

OK but if left join is used does where exist achieve anything

It seems also that left keep is speed (of loading) wise better than where exists

Are you saying that you would use where exists rather than left keep when loading from QVD tables

0 Likes
903 Views

Left joins on QVD loads can also remain optimised.

I personally like where exists, as does Jennell McIntire who wrote the blog post.

If you prefer left join then go for it.

0 Likes
903 Views
Not applicable

Thanks Jennell.

Personally I uses Exist clause during loading. I found interesting of using Dual, I will try to use it in my scripts.

- Arindam

0 Likes
903 Views
Not applicable

Thanks for this wonder effort:-)

0 Likes
903 Views
Not applicable

When I don't need to join the tables, and just restrict their values, I use more frequently Left Keep rather than Exists.

I don't know, I just like its syntax more, anyways, there doesn't seem to be any performance improvements over one another, the Qvd's load is still optimized.

Also, I don't use where on the second table while using Left Join, at least for this case, in which I'm only restricting values, It feels like double the job.

c ya

903 Views
Not applicable

Nice post Jennell,

I want to add a new comment related to exists and not exists function

I like "exists" but I don't like "not exists", I'll explain it with a simple script example

1:

LOAD * INLINE [

   id

    1

    2

    3

    4

    5

    6   

];

2:

LOAD * INLINE [

   id, flag

    1, 1

    2, 1

    3, 1

    4, 1

    5, 1

    6, 1

    7, 2

    7, 1

] where not exists(id);

join (1) load * resident 2;

drop table 2;

with this script I expected to get a table like this:

id, flag

    1, -

    2, -

    3, -

    4, -

    5, -

    6, -

    7, 2

    7, 1

but I get this one:

id, flag

    1, -

    2, -

    3, -

    4, -

    5, -

    6, -

    7, 2

Is there a reason for that or it's a bug?

Cheers,

David

903 Views
sudeepkm
Valued Contributor III

Hi David,

your question requires an explanation but the desired result can be found using "not exist" like below.

1:

LOAD * INLINE [

   ids

    1

    2

    3

    4

    5

    6  

];

2:

LOAD * INLINE [

   id, flag

    1, 1

    2, 1

    3, 1

    4, 1

    5, 1

    6, 1

    7, 2

    7, 1

    7, 3

] where not exists(ids,id);

903 Views
datanibbler
Esteemed Contributor

Hi Bill,

that is a useful comment. I wasn't aware of that up to know. It's good to know. If data is loaded only for a one-off use, however, I guess it isn't worth the effort to first create a filtered qvd and then load optimized.

@ Jennell,

great post! The DUAL() function is very useful for sorting, e.g. for months.

0 Likes
903 Views
Not applicable

Nice workaround Sudeep, thanks a lot!

0 Likes
903 Views
robert99
Valued Contributor II

I use where not exists to ensure two entries are not loaded twice in one table

Cust Number / Cust Nmae

1000 / ABC Ltd

1000/ABC Limited (should not be in the table)

So for me it is not  a bug. I only want the first customer loaded not the 2nd

0 Likes
903 Views
Not applicable

HiDavid Pertegal Miranda

the reason you are getting that table is because on table 2, the id that the LOAD, will... load, are the ones that aren't on memory already... FOR EVERY RECORD.... so when you load the pair:

7, 2   => at that precise moment, the "7" begins to exist in the field 'id' in your associative database... so when it finds the latter:

7, 1   => Qv says "Hey, I already got a 7 on the 'id' field, I ain't loading that shizznit!   "  (dropped a little slang there for the laughs...)

SO!

The combination 7,1 never really got loaded anyways, and it doesn't matter if the "join" behaves as an outer join... if you didn't load the 7,1, there is no way it is on the resulting table  

let me know if I was clear in my explanation! cheers and c'ya

ps: Sudeep Mahapatra here's your explanation ; )

903 Views
Not applicable

Understood, it's a sequencial clause,

but I still think that's not a good or expected behaviour.

I would add another parameter in not exists function that gives information about the table or tables where you want to check if the value exists or not, don't you think?

Cheers!,

David

0 Likes
903 Views
Not applicable

In a way, that is included in the fact that the function takes 2 parameters, being the first one the expression that generates the values I want to check against...

BUT, you have a point there, It would be clearer If we could specify a table to check against with. (and more "keep" like)

I'm guessing that maybe there is a reason we can't do that so easily (the structures of the tables in memory, or other overly technical mumbo jumbo! xD )

Anyways, It's just something we have to have in mind constantly.... This would be another reason to support the use of Left Keep over this kind of approach for me (I like Keep the most)

c ya

EDIT: corrected the first paragraph, I was mistaken about the order of the parameters 😮

0 Likes
903 Views
geetaalhan
Contributor

Thanks for sharing.

0 Likes
903 Views
Not applicable

The dual example is great and it will be very useful for me, but I have a problem with the Exist example. In the example you give us, the only thing that is limiting the records loaded is the left join. If you change the left join for a join, then the result will include PersonID 11 and 12, meaning the Exist statement doesn't do anything. This is because the where Exists(PersonID) clause checks for all rows loaded, including the rows loaded from the AgeTemp table (which also has a field named PersonID) At the time you join the two tables, the field PersonID has all the values loaded from the People table + the values loaded from the AgeTemp table. Someone please correct me if I'm wrong.

Regards,

Diego

903 Views
Not applicable

Yes, I agree, the example is misleading, the exists doesn't do anything in that case as you correctly pointed out.

0 Likes
903 Views
ankitbisht01
Contributor

Thank You , nicely explained

0 Likes
903 Views
beck_bakytbek
Honored Contributor

Great post, thanks for sharing

0 Likes
903 Views