Skip to main content
Woohoo! Qlik Community has won “Best in Class Community” in the 2024 Khoros Kudos awards!
Announcements
Nov. 20th, Qlik Insider - Lakehouses: Driving the Future of Data & AI - PICK A SESSION
cancel
Showing results for 
Search instead for 
Did you mean: 
datanibbler
Champion
Champion

Join returns slightly different numbers than a pure link - why?

Hi,

I am working on separating a list of filenames into a part that has to be processed with priority according to an Excel_shortlist and the rest.

The files then have to be renamed. For the prioritary ones, I already know how I have to change the filename, so that will be time-consuming, but quite easy.

The point where I'm a bit confused is this: I have the list of filenames and the shortlist - I have to join these to find out which filenames appear on my shortlist and which don't.

To find out about this, I have made a copy of the keyfield in both of the tables - those fields are thus not linked so I can query them.

When I don't join the tables, but merely link them via that keyfield, I can simply use a set of listboxes to

- select all available values of that copy_of_keyfield in my table B (which will automatically make all the values in table_A selectable

   which match those and exclude all others)

- select all excluded values in table_A => that will return all the values in table_A that do not have a match in table_B.

=> I can count both and the figures add up nicely to the total nr. of files, just like they should (e.g. 6.759 matching and 37.490 not matching)

<=> When I do a JOIN on the same keyfield, however, the numbers change - the matching values (those where the copy_of_keyfield from table_B has a positive length) are more - so that the total nr. of files is not correct anymore. - I now have 7.263 matching and 37.490 not matching which in total is too much.

This baffles me a bit - how can that be?

Stranger still, if I do a RESIDENT LOAD from this with a WHERE clause querying the field >Bezeichnung< for a positive length, the number on the GUI is the same as in the last scenario (with the JOIN), but the table_viewer tells me there are even more records in the table, and I cannot find the others since there is no selection on the GUI. - the table_viewer tells me there are 7.673 records in the table, on the GUI it's 7.263 again.

Can anyone help me shed some light on this? I'll try with a table_chart - and I guess there is no harm in attaching the qvd's I am using so you can hopefully reconstruct my issue and tell what I'm doing wrong.

Thanks a lot!

Best regards,

DataNibbler

1 Solution

Accepted Solutions
marcus_sommer

Hi DataNibbler,

joins are often meticulous with duplication or removing records and you need a very good knowledge about the data-quality in the used tables and if they have a lack of them you will need another strategy. An association from the tables per key is one and another could be to map those data together: Mapping as an Alternative to Joining.

- Marcus

View solution in original post

2 Replies
marcus_sommer

Hi DataNibbler,

joins are often meticulous with duplication or removing records and you need a very good knowledge about the data-quality in the used tables and if they have a lack of them you will need another strategy. An association from the tables per key is one and another could be to map those data together: Mapping as an Alternative to Joining.

- Marcus

datanibbler
Champion
Champion
Author

Hi Marcus,

yes, I just thought of that. Since when I do not join, the total nr. is correct, I think I'll go with that and not worry too much about what I did wrong. I also tried with a mapping LOAD and that returns the exact same and correct numbers. So I'll just continue without joining. It's a one-off thing, anyway, so there's nothing wrong about making a table_chart, exporting it to Excel and loading it again.

Best regards,

DataNibbler