Skip to main content
Announcements
Qlik Connect 2024! Seize endless possibilities! LEARN MORE
Jennell_McIntire
Employee
Employee

Some time ago I wrote a blog on Mapping functions and described how they can be used to replace or modify field values when you run the script.  But how do you know when to map versus join the data in your data model?  Mapping works well when you need to look up a single value in another table.  For instance, you may have a products table with product data like the table below and you want to add the product category name to that table.

Products.png

The product category name is in another table that looks like this:

ProductCategory.png

Now you can add the ProductCategory field to the Products table by doing a join and that would work fine but you can also add the ProductCategory field by simply mapping.

Using a join:

Left Join.png

Using a map:

Mapping.png

Since we only want to add one value to the Products table, mapping is a safer option.  With this small sample data, either will work but sometimes when you have a large data set, you have be cautious when using joins.  You need to watch out for new records being added to the table as a result of the join thus potentially changing calculations.

While both a join and a map can work to combine data from two tables, in cases where only one value needs to be added, choose to map.  It is an easier approach and it reduces the chance of errors being made in your data model.  Now I am not saying that joins are bad and should not be used because that is not the case at all.  I am simply stating that mapping should always be used versus a join when you only need one value.

Thanks,

Jennell

15 Comments
Not applicable

Nice post Jennell, I have question around performance of both option on large dataset.

If we have an option to choose one of them, which operation will perform well on large data set given that both operation is giving correct output?

Regards,

Anosh Nathaniel

0 Likes
17,745 Views
Gysbert_Wassenaar

Good post. Applymap deserves to be brought to attention again.

For those who's attention is caught, read these two blog posts as well:

17,745 Views
Anonymous
Not applicable

I use ApplyMap() a lot.

17,745 Views
rwunderlich
Partner Ambassador/MVP
Partner Ambassador/MVP

Another place where ApplyMap stands out is the ability to assign a default for unmatched data:

ApplyMap('ProductCatMap', ProductId, 'Uncategorized') as ProductCategory

or even a more complex default:

ApplyMap('ProductCatMap', ProductId, if(ProductId<100, 'Warehouse Stock', 'Uncategorized')) as ProductCategory

17,745 Views
mrooney
Contributor III
Contributor III

There is also a workaround to map more than one field in only one Mapping Load.

Mapping Multiple Fieds with only one Mapping Load

0 Likes
17,745 Views
stevedark
Partner Ambassador/MVP
Partner Ambassador/MVP

gwassenaar has beaten me to posting a link to my own blog post (thanks Gysbert), but in there you will find what I like about ApplyMap, and as Rob mentions, it is the flexibility it offers.  It always makes me smile when I use an ApplyMap in a MAPPING load statement or use ApplyMap as the third parameter of an ApplyMap statement.  Nesting in this way is something you would not be able to do with JOIN alone.

Steve

17,745 Views
hic
Former Employee
Former Employee

anoshnathaniel: In my experience, Applymap() always outperforms a Join. And with less risk of errors.

HIC

12,348 Views
karimeddini_sla
Partner - Contributor III
Partner - Contributor III

While I normally agree with the great advice HIC & Rob W. provide and follow the general practices the Qlik blogosphere advocates, I find that the claims of Mapping being less error-prone than Joining do not fully represent the issue (this comment has nothing to do with performance, only the "error-prone" claim). From the perspective of a business analyst and/or somebody not comfortable with relation algebra (e.g.: SQL), I can see how Mapping seems better. Now, I understand that QlikView is associative (as opposed to relational) and intended more for business analysts than for developers, but from the my perspective as both a general and database developer, Mapping is simply another option with its own advantages and disadvantages. Specifically, the often-mentioned problem case of multiple rows when joining is also a problem for mapping; mapping, instead, chooses 1 of the many at random (perhaps it is mapping load order), which can be just as much of a problem for calculation accuracy and can mask the need for grouping and aggregating your mapping table.

0 Likes
12,348 Views
rwunderlich
Partner Ambassador/MVP
Partner Ambassador/MVP

Excellent points Peter. I would respond to the "error-prone" claim; mapping guarantees only a single match (which by the way is the first match in load order). It's not necessarily "better" but offers a degree of certainty which is desirable in many situations.

One of the challenges we often face in BI is translating the  current available data model into a model accurate for historical analysis. For example, the operational ERP only cares about "what is the price today, when I create this order", whereas the historical analysis needs to know what  the price was when the order was cut which may or may not be be recorded in the fact data.

Features such as Join, ApplyMap, or IntervalMatch address different scenarios of data availability, currency and operational practices. I think it's good to know and understand all of these capabilities as it provides the options to create useful analytics with the given available data.

12,348 Views
hic
Former Employee
Former Employee

Mapping is indeed just another option with its own advantages and disadvantages, and should not always be used. There are plenty of cases where a join is better.

However, I still claim that Applymap() is less error-prone, and I base that statement solely on my empirical experience: I have seen countless cases where joins have been used to get one value from a different table, and the developer "knows" that it is a one-to-one relationship. But it turns out that it isn't, and as a result QlikView gets an incorrect number of records, with wrong numbers as a consequence.

In a situation where the developer knows that there should only be one value, Applymap() becomes the best choice, since you then use this à priori knowledge.

HIC

0 Likes
12,348 Views