Skip to main content
Announcements
See what Drew Clarke has to say about the Qlik Talend Cloud launch! READ THE BLOG
cancel
Showing results for 
Search instead for 
Did you mean: 
pkelly
Specialist
Specialist

Mapping Load / Apply Map versus Left Join

Mapping Load / Apply Map was functionality which was shown to us during our initial training session with QlikTech many moons ago but it is not something that I have used within my scripts instead using left joins to get say description of a code into my QVD table.

Not really sure what the benefits or differences are between using mapping and left joins and would be grateful if someone could elaborate on this for me...

Many Thanks

15 Replies
Not applicable

You are wrong puneet. We can still get the common filed column in the new consolidated table.

patricio
Contributor III
Contributor III

I want to share an additional benefit that I found in the ApplyMap if compared with the left join when modifying a script.

Imagine that you have the Color Code in your table and one day you decide to add the description that is in the color table. If you use a left join you risk to add additional records if for any reason there are duplicates in your Color table.

By using the applymap you limit your testing after the change because you are 100% sure that you are not adding additional records even if there are duplicates in your color table.

I agree with you that you should not have duplicates in your Color table but if instead of the Color is any other Excel table maintained by users without validation the applymap will become less risky.

AnandKu
Employee
Employee

the link is dead, here is the saved copy

marcus_sommer

Here an update to the matter:

let start = now();
t: load recno() as ID, mod(recno(), 4) as Status, chr(ceil(rand() * 127)) & chr(ceil(rand() * 127)) as X
autogenerate 75000000;
let noRows = noofrows('t'); let end = now(); let time = time('$(end)' - '$(start)'); trace '$(time)';
c: load $(noRows) as Records, -1 as No, 'raw-data load' as Type, '$(end)' as End, '$(start)' as Start, '$(time)' as Time autogenerate 1;

let start = now();
store t into Dummy.qvd (qvd); drop tables t;
let end = now(); let time = time('$(end)' - '$(start)'); trace '$(time)';
c: load 0 as Records, 0 as No, 'raw-data store' as Type, '$(end)' as End, '$(start)' as Start, '$(time)' as Time autogenerate 1;

m: mapping load recno() - 1 as Status, chr(recno()+64) as Return autogenerate 4;
l: load recno() - 1 as Status, chr(recno()+64) as Return autogenerate 4;

let start = now();
t: load * from Dummy.qvd (qvd);
let noRows = noofrows('t'); let end = now(); let time = time('$(end)' - '$(start)'); drop tables t; trace '$(time)';
c: load $(noRows) as Records, 1 as No, 'no processing' as Type, '$(end)' as End, '$(start)' as Start, '$(time)' as Time autogenerate 1;

let start = now();
t: load *, applymap('m', Status) as StatusX from Dummy.qvd (qvd);
let noRows = noofrows('t'); let end = now(); let time = time('$(end)' - '$(start)'); drop tables t; trace '$(time)';
c: load $(noRows) as Records, 2 as No, 'applymap' as Type, '$(end)' as End, '$(start)' as Start, '$(time)' as Time autogenerate 1;

map Status using m;
let start = now();
t: load * from Dummy.qvd (qvd);
let noRows = noofrows('t'); let end = now(); let time = time('$(end)' - '$(start)'); drop tables t; trace '$(time)';
c: load $(noRows) as Records, 3 as No, 'map using' as Type, '$(end)' as End, '$(start)' as Start, '$(time)' as Time autogenerate 1;
unmap *;

let start = now();
t: load *, pick(Status - 1, 'A', 'B', 'C', 'D') as StatusX from Dummy.qvd (qvd);
let noRows = noofrows('t'); let end = now(); let time = time('$(end)' - '$(start)'); drop tables t; trace '$(time)';
c: load $(noRows) as Records, 4 as No, 'pick' as Type, '$(end)' as End, '$(start)' as Start, '$(time)' as Time autogenerate 1;

let start = now();
t: load * from Dummy.qvd (qvd); left join(t) load * resident l;
let noRows = noofrows('t'); let end = now(); let time = time('$(end)' - '$(start)'); drop tables t; trace '$(time)';
c: load $(noRows) as Records, 5 as No, 'join' as Type, '$(end)' as End, '$(start)' as Start, '$(time)' as Time autogenerate 1;

drop table l;

which results in:

marcus_sommer_0-1676384053632.png

whereby the if-loop from the origin example was replaced by a pick() approach and no artificial where-clause is applied for the map using load because it's not needed anymore (maybe it was a bug within the old releases or Qlik changed the processing). To mention is further that the results has some differences between multiple runs of about 10 - 15 seconds to each one of the approaches which means that my system/network-storage has some external impacts. Without measuring  each run I think the showed results is quite the average.

I must admit that I had expected bigger differences between the methods especially for more run-time benefits of the applymap(). Beside this the test is very simplified and with more complex data and requirements the run-time comparing may be quite different.

In my real life I use meanwhile nearly always mappings whereby not seldom multiple ones horizontally and vertically nested (both in lookups as well as by the returns) and they work very well and fast - better as all the other approaches before. 

 

crgrossman
Contributor
Contributor

I would use the apply map for business values that are not standardized that have to managed manually or through spreadsheets, and left join if the business logic is coming from a static source like a database or API. 

The apply map also allows you to handle exceptions in a single line of code that's easy to understand. 

Ally
Contributor
Contributor

Going through the thread, one difference that hasn't been highlighted is that if you have duplicate values in your second table (let's call it table 'B') and have values assigned to those duplicates,

when using left join you will see all those values appended into your original table 'A'.

If you use a mapping load the duplicate values will not show up in table 'A' and only one row will be appended from table 'B' - it will pick the first one it sees and you'll only see that one value in the 'A' table.