Skip to main content
Announcements
Join us at Qlik Connect for 3 magical days of learning, networking,and inspiration! REGISTER TODAY and save!
cancel
Showing results for 
Search instead for 
Did you mean: 
acbishop2
Creator
Creator

Choose which duplicate to remove.

Hello All,

I'm connecting two tables: Leads and Apps. When a lead becomes an app, we record the AppId on the Leads table to connect it with the AppId on the App table and do so with a simple JOIN statement. Pretty simple so far, right? My issue stems from the fact that we decided to allow multiple leads to be connected to the same app. This is necessary for our business model, but has caused a problem for me when reporting. Because there may be two or more LeadIds for each AppId, some data is getting duplicated, as in this dummy data.

LeadIdCampaignIdAppIdAppNameAppDate

11

123123100Donald Duck1/1/2018
22100Donald Duck1/1/2018
33123123101Mickey Mouse1/2/2018
44102Minnie Mouse1/3/2018
55555555102Minnie Mouse1/3/2018

The problem with the data is that if I were to count(AppId), I would get 5 instead of the desired 3 (yes, I know I could do count(distinct(AppId)), but that wouldn't solve the actual problem). I was able to remove duplicates, but my script simply gets rid of the second one (or any after the first). Here's how it looks currently:

T1:

Load

     LeadId,

     CampaignId,

     If(Previous(AppId)=AppId,0,AppId) as AppId,

     AppName,

     AppDate

From ...

Order by AppId;

Final:

Load

     LeadId,

     CampaignId,

     AppId,

     AppName,

     AppDate

Resident T1;

Drop Table T1;

Instead of getting rid of all duplicates after the first, I need to be able to keep the one with a value in the CampaignId column. In the table above, I would keep the rows with LeadIds 11, 33, and 55.

I hope I articulated my need well enough. If anything didn't make sense, let me know. And as always, thanks in advance for the help!

Labels (1)
2 Replies
framacdev
Contributor III
Contributor III

Hi Aaron - thanks for sharing your business problem.

The way to resolve your problem is to use the Exists function, very helpful when you want to filter field values that have already been loaded. Here is the link reference: https://help.qlik.com/en-US/qlikview/November2017/Subsystems/Client/Content/Scripting/InterRecordFun...

What you need to to is to load after the "Drop Table T1" statement the following script content:

OnlyCampaignId:

LOAD CampaignId

Resident Final

;


NoConcatenate

LOAD *

Resident Final

Where Exists(CampaignId)

;

DROP TABLES Final, OnlyCampaignId;


I hope the mentioned script resolves your business problem.


Best regards,


Francesco

jonathandienst
Partner - Champion III
Partner - Champion III

When data has differing granularity, it is better to let Qlik handle the association, rather than joining the tables.The tables will associate on AppID.

Logic will get you from a to b. Imagination will take you everywhere. - A Einstein