Skip to main content
Announcements
Qlik Connect 2024! Seize endless possibilities! LEARN MORE
cancel
Showing results for 
Search instead for 
Did you mean: 
bickyqlik
Contributor II
Contributor II

Inner join resulting cross multiply records

I am doing a inner join for multiple tables.

One table is having 21000 records and second table is having 22700 records

Both tables have one common column name. While doing inner join the resulting table is having 599k records. Ideally it should have <=22700 records but in reality it's cross multiplying. don't know where is the issue.

inner join([<table>])

Load

           fields..

from <table1>

inner join([<table>])

Load

           fields..

from <table2>

inner join([<table>])

Load

           fields..

from <table3>

Labels (2)
13 Replies
Or
MVP
MVP

Since you didn't include the actual code, it's hard to guess where the issue is. I do note that you have three tables, not two, but your original scenario only describes two tables. Perhaps the third table is the issue? Alternatively, perhaps the tables have a relationship that results in multiple instances of the key field value, e.g.

Table1:
Load * INline [
Field1, Field2
A, 1
A, 2
B, 3
C, 4];


INNER JOIN
Load * INLINE [
Field1, Field3
A, 10
A, 11
A, 12
B, 20
C, 30];

This will result in 6 lines for Field1 = A, one for each combination of Field2 and Field3.

lukas_
Contributor III
Contributor III

Hello,

As a general rule, when a join does not work correctly, this is due to the name of the field which is not strictly the same or when the join is done on several fields without that being wanted. To answer more precisely, it will take the contents of the script.
As Or indicates, you also have to take into account the cardinality of your tables and the "duplicates" row that there may be.

bickyqlik
Contributor II
Contributor II
Author

My code. Please advise

[REPORT]:
LOAD
plan,
plan_name,
plan_desc,
plan_location
FROM [plan.xlsx]
(ooxml, embedded labels, table is [plan_detail])
where Match(plan,204243, 15138, 204850, 15572, 15589, 201403, 204829, 210395);

inner Join ([REPORT])
LOAD
plan,
plan_title,
plan_staff,
plann_detail
FROM [plan.xlsx]
(ooxml, embedded labels, table is [plan_detail])

inner Join ([REPORT])
LOAD
plan,
plan_emp,
plan_emp_loc,
plan_emp_code,
plan_emp_age
FROM [plan_worker.xlsx]
(ooxml, embedded labels, table is Workers);

bickyqlik
Contributor II
Contributor II
Author

Am I doing right way or missing something here. If I read all tables separately, getting right number of records but when joining getting a very high number of records

 

Or
MVP
MVP

If I had to guess, you have one line per plan in the first table, multiple lines per plan in the second table, and multiple lines per plan in the third table. Since you are joining both the second and third tables together with the first, this results in a many to many relationship. Of course, that's just a guess based on the field names and without access to the underlying data.

lukas_
Contributor III
Contributor III

To be sure of the cardinality of your tables, you should load them without a join (thanks to a qualify or by loading only one) and look in the Data model viewer if your tables have duplicate key values.

bickyqlik
Contributor II
Contributor II
Author

Yes. You are right

bickyqlik
Contributor II
Contributor II
Author

Table has duplicate key value but the following values are different and making it unique record. Now it's becoming many-to-many relationship and so leading to a larger set of records. I can see them when I do a concatenate rather than a join. So to avoid many-to-many relationship, what is the best solution here?

Or
MVP
MVP

In this scenario, typically what you want to do is simply Keep instead of Join. Unlike SQL, you don't have to merge everything into one "query" here.