Do not input private or sensitive data. View Qlik Privacy & Cookie Policy.
Skip to main content

Announcements
Qlik Open Lakehouse is Now Generally Available! Discover the key highlights and partner resources here.
cancel
Showing results for 
Search instead for 
Did you mean: 
qlikuser233
Contributor II
Contributor II

Join data duplicating records

I am joining two data sources, the issue is that I am brining in certain columns for each and they share 4 columns with the primary key being unique identifier. The columns that are missing from Tracker 2 are populating as 0's instead of merging into one line.

Tracker: 
LOAD  Client, 
     [Client ID], 
     [Team Cd], 
     [Policy/Contract Effective Date], 
     [Policy/Contract Expiration Date], 
     [Pol/Cont Num], 
     [Captive or Non-Captive],
     [Policy/Contract Effective Date Conversion], 
     Adjustment,
     [Client ID]&[Pol/Cont Num]&[Policy/Contract Effective Date] as [Unique Identifier], 
[3133]
     [3137]
 
Outer join (Tracker)
 
Tracker_2:
LOAD A as [Client ID], 
     B as [Client], 
   C as [Pol/Cont Num]
     D as [Policy/Contract Effective Date],
     I as [Unique Identifier], 
Z as [Timing - Waiting for Registration (New/RNW or Adjustment) - Premium - COU, PAU, ACR],

Attached is a photo for one record. It splits into 2 lines, the top line is coming from tracker 2 as it has 0's where it should be filling in the information from Tracker. I have crossed out the unique identifier number but they are identical so I cannot figure out why this is creating a line for each source rather than merging them when the primary key for both sources is identical. 

Labels (1)
12 Replies
qlikuser233
Contributor II
Contributor II
Author

I can see that in the table it is taking my unique ID and breaking it down into both sources so I have a line for each. The first row contains data for a column not in Tracker 2 and vice versa for Tracker. So why is it breaking the data sources out instead of merging them? @marcus_sommer 

marcus_sommer

Like @Or mentioned a join is performed against all key-fields which means that all field-values of them must be matching.

Further missing keys and/or any duplicates will have an impact on the resulting data-set and may adding respectively removing records (depending on the applied join-type). To get rid of it could be quite difficult especially if there is not enough knowledge and control in regard to the data-quality. Therefore I suggest to change the join-logic to a mapping which is often easier, more flexible and performant and without any risk to change the number of records:

Don't join - use Applymap instead - Qlik Community - 1467592

qlikuser233
Contributor II
Contributor II
Author

I have figured out the issue, there was a same column name from both sources one was a calculated field and the other a hard coded input that was different once I removed there was no issue. Thanks everyone