Qlik Community

Qlik Design Blog

All about product and Qlik solutions: scripting, data modeling, visual design, extensions, best practices, etc.

Don't join - use Applymap instead

My latest blog post was on joins in QlikView (see To Join or not to Join). In it I claimed that you should avoid making joins in the QlikView script, if possible. This blog post is about a function that can help you avoid joins. It is about the function Applymap().

It is, in fact, one of my favorite functions and I do not seem to be alone in this choice. Several of the QlikView developers who regularly write on QlikCommunity seem to share this preference and have written about this function, e.g., QlikView Notes, Quick Intelligence and QVApps.

 

So, what does the function do? Well, basically it is just a lookup function – it takes one value as input, checks if this value exists in a mapping table and returns the corresponding value from the second column in the mapping table. Think of it as a translation function where the translation is fetched from a pre-defined, two-column table. A simple translation function, but you can still do a lot with it...

 

What has this got to do with joins? Everything. A very common situation in data modeling is that you need to get a single field value (per record) from a different table than the current one. Then the standard way to do this is to use a join. However, in QlikView you can – and should – use Applymap() instead. The only time that you cannot use Applymap() instead of a join is if you want to fetch more than one single corresponding value (record) from the second table.

 

Let's look at an example of how to use Applymap(): In an ERP system, the customer table is the table where all information about the customer is stored, e.g. the name and country of the customer. Further, it could be that you need the customer’s country when making a calculation in the order table. This is a field that you don't have in the order table. In SQL you would join the two tables to get country into the order table.

 

In the QlikView script you would instead first define the mapping table that maps a customer ID to a customer country as follows:

 

     MapCustomerIDtoCountry:
     Mapping Load CustomerID, Country From Customers ;

 

The second step is to use this information when reading the order table, such as:

 

     Orders:
     Load *,

          ApplyMap('MapCustomerIDtoCountry', CustomerID, null()) as Country
          From Orders ;

Tables2.png

 

The "null()" as third parameter of the function defines what it should return when the customer ID isn't found in the mapping table.

 

And with this, you have joined the field Country with the order table – without using a join. And you have done it faster and with less chance of errors. Bottom line: Whenever you know that you just want to fetch one single value per record – don’t join. Use Applymap() instead.

 

HIC

 

See more in the Technical Brief on Joins and Lookups.

82 Comments
carbal1952
Contributor II

I use ApplyMap when is relevant. It's more confortable and easy to find in the script. Obviously when the source tables are not big. If its limits were eliminated maybe JOIN will be almost forgotten. Thank to ApplyMap creator. CB.

0 Likes
341 Views
m2r4miller
New Contributor III

I think 5 ApplyMaps is the way to go, but don't hit the database 5 times - instead load the key field and the 5 other fields from the DB into a temp table, then build the Mapping Tables from the temp table using Resident loads, then drop the temp table, like so:

TempTable:

SQL SELECT

     key,

     field1,

     field2,

     field3,

     field4,

     field5

from sqltable;

Map1:

Mapping LOAD

     key,

     field1

Resident TempTable;

// --- Repeat for all the other Mapping LOAD statements ---

DROP Table TempTable;

Cheers!

0 Likes
341 Views
MVP & Luminary
MVP & Luminary

Hi Mark,

Being a tidy freak that likes code re-use; I would not repeat that mapping statement in that case but use a loop instead:

for i = 1 to 5

     Map$(i):

     MAPPING LOAD
          Key,

           field$(i)

     RESIDENT TempTable;

next

As well as loving ApplyMap I also think dollar sign expansion is pretty cool too. 

Steve

341 Views
Partner
Partner

Another thing of interest: Mapping Tables cannot be dropped. The DROP statement won't raise an error but the table won't be dropped I had a hard time figuring this out while loading many XLS file in a For each loop.

0 Likes
341 Views
m2r4miller
New Contributor III

Absolutely, but the DROP statement in my example was dropping a table - not a mapping table.

0 Likes
341 Views
m2r4miller
New Contributor III

That makes great sense! I like to use FOR loops as well. In my example I was just being lazy with field1, field2, etc. I don't expect anyone is actually hitting production tables like that, but then again you never know!

0 Likes
341 Views
MVP & Luminary
MVP & Luminary

You are absolutely correct, apart from Address fields 1 - 5 I've never come across any fields that are conveniently enough named to do this.

What I have done though is got a list of fields loaded in an Inline table, enumerated around those to build the Mapping tables (using for i = 0 to NoOfRows('Temp_MappingFields') -1 and then a Peek statement) and also whilst in the loop built a variable with all of the mapping statements in it,doing something like:


let vMapStatements = vMapStatements & 'ApplyMap(' & chr(39) & 'Map_$(vField)' & chr(39) & ', KeyField) as $(vField),'

With a bit of dollar expansion in the load statement you have some nice succinct code and you can add extra fields from the table you are mapping from simply by adding the field name to the Inline table.

Steve

341 Views
Not applicable

Thanks Henric, and others ..

Is there any comment on MAPPING table is size? if the mapping table used in ApplyMap is large if it is over 1M or 100M does it effect anything?

Ersen

0 Likes
341 Views

@ David VASSEUR

You write that "Mapping Tables cannot be dropped". I say: Au contraire! They are automatically dropped at the end of the script run.

HIC

0 Likes
341 Views

@ ebolluca

I have never hit an upper limit on mapping table size, so I assume that the limit is 2 billion distinct values (as the limit is in all other places in QlikView). Which means that if you have a large mapping table, it might consume a lot of memory during the script run, but not after that.

HIC

0 Likes
341 Views
Partner
Partner

Henric,

I mean, they cannot be dropped during script execution of course!

0 Likes
341 Views

You don't need to drop them. If you use a mapping table in a loop, like Steve Dark does above, you can have dynamic labels e.g. Map$(i) that you can use to reference the right mapping table.

HIC

0 Likes
341 Views
jason_michaelid
Honored Contributor II

HIC,

It would be nice to determine when to drop mapping tables though.  We are taught to drop resident tables as soon as they are no longer needed so as to conserve RAM during script execution. If you know you no longer need a mapping table it would be good to be able to drop it before the end.

0 Likes
341 Views
m2r4miller
New Contributor III

That is an excellent idea! Would you mind sharing a more complete example?

Thanks!

0 Likes
341 Views
Not applicable

So the first column is always the lookup column?

MapCustomerIDtoCountry:
     Mapping Load CustomerID, Country From Customers ;

   Orders:
     Load *,

          ApplyMap('MapCustomerIDtoCountry', CustomerID, null()) as Country
          From Orders ;

You look up the customerid, if it match you throw back the country - and if customerid is not in the mapping table, you throw back null? right?

0 Likes
341 Views

You've got it. /HIC

341 Views
MVP
MVP

Informative

0 Likes
341 Views
Not applicable

Thanks Henric..I am learning a lot very quickly from your posts..

0 Likes
341 Views
rsdhavle
Contributor II

Very good blog and very helpful..Thanks a lot

0 Likes
341 Views
Not applicable

Hi

What if I have a one to many situation where the ID is unique from the mapping table, but not at the source table that I apply to? So far I still get the correct caculations.

0 Likes
341 Views

-Mikael-

This should work fine. Then a specific value in the mapping table is used several times.

HIC

341 Views
Not applicable

Hi HIC,

I knew this but to confirm, If the Lookup column in the Destination is given different column name than the Source Lookup Column name than will the Script throw error or it will execute with defined third parameter or it will do the lookup and get the output into my destination table ?

Thanks,

Santosh

0 Likes
341 Views
Partner
Partner

Name of columns in the mapping table does not matter.

The lookup value will be matched to the first column and output from the second column.

Is there no match in first column the optional third parameter will be returned.

Is there no third parameter either the lookup value will be returned.

341 Views

The Applymap() function will never check the field names - it will only compare the field values. So, it will do the lookup, and what you call the two columns in the Mapping table is really irrelevant.

HIC

341 Views
Not applicable

Thanks for your Time

ok then let me confirm from my side again

If i change the CustomerID to Cust_ID  and do the reload , this will not throw any error ?? and will it give me the output ??

     MapCustomerIDtoCountry:
     Mapping Load Cust_ID , Country From Customers ;

The second step is to use this information when reading the order table, such as:

     Orders:
     Load *,

          ApplyMap('MapCustomerIDtoCountry', CustomerID, 'Unknown') as Country
          From Orders ;

0 Likes
341 Views
Not applicable

ok , I have tested it now, what you were suggesting is correct !!

Thanks for your Time and Comment hic and AndersEriksson !!

0 Likes
341 Views

The field reference in the Mapping load must of course match what you have in the input table. But what you call it is irrelevant. So, your suggested change will cause an error. But the following will not:

MapCustomerIDtoCountry:
     Mapping Load CustomerID as SomethingElse1, Country as SomethingElse2 From Customers ;

Orders:

     Load *, ApplyMap('MapCustomerIDtoCountry', CustomerID, 'Unknown') as Country From Orders ;

hic

0 Likes
341 Views
Arjunarao
Honored Contributor II

Nice blog post.

Thank you.

0 Likes
341 Views
Partner
Partner

And to confuse things further, if you change the order of fields in the mapping table you could find yourself in trouble.

The below mapping would match the CustomerID field from the Orders table with Country field of Customers table.

If by chance there is a Orders.CustomerID = Customers.Country you would get Customers.CustomerID.

If not you would get 'Unknown', probably not what you expect if new to mapping tables.

Just to emphasize field names in mapping tables have no significance at all.

Mapping tables have exactly two columns, first is the mapping key, second is the mapped value.

MapCustomerIDtoCountry:
     Mapping Load Country, CustomerID From Customers ;

Orders:

     Load *, ApplyMap('MapCustomerIDtoCountry', CustomerID, 'Unknown') as Country From Orders ;

0 Likes
341 Views
qxshower
New Contributor

I have been working on the duplication problem (due to a left join of tables) for a couple of days but no luck until I saw this great solution.

It is working in Qlik Sense too, just by replacing "from" to "resident".

Thanks!

0 Likes
341 Views