Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 
Anonymous
Not applicable

Identifying duplicate records using the script

Hi

I want to identify duplicate records.  I've used the Peek and Previous function but not getting the desired result.

I have a table below with field Unique ID.  I want to create another field showing which values are duplicates. ( 1 if there is a duplicate) - see 2nd table below.  How do I do do this?

Unique ID
A1
A2
A3
A4
A5
A6
A7
A1
A5
A6
A8
A9

Unique IDDuplicate
A11
A2
A3
A4
A51
A61
A7
A11
A51
A61
A8
A9

Kind regards

Nayan

1 Solution

Accepted Solutions
tamilarasu
Champion
Champion

Hi Nayan,

Try this,

Data:

Load * inline [

UniqueID

A1

A2

A3

A4

A5

A6

A7

A1

A5

A6

A8

A9

];

Left Join (Data)

Load UniqueID,

If(Count(UniqueID)>1,1) as Count

Resident Data

Group by UniqueID;

View solution in original post

7 Replies
kaushiknsolanki
Partner Ambassador/MVP
Partner Ambassador/MVP

Try this,

Sample:

Load * inline [

UniqueID

A1

A2

A3

A4

A5

A6

A7

A1

A5

A6

A8

A9

];


Temp:

Load * where Count>=2;

Load UniqueID as UID,Count(UniqueID) as Count

Resident Sample

Group by UniqueID;


Final:

Load *,If(Exists(UID,UniqueID),1,0) as DuplicateFlag

Resident Sample;


Drop table Sample, Temp;


Regards,

Kaushik Solanki

Please remember to hit the 'Like' button and for helpful answers and resolutions, click on the 'Accept As Solution' button. Cheers!
tamilarasu
Champion
Champion

Hi Nayan,

Try this,

Data:

Load * inline [

UniqueID

A1

A2

A3

A4

A5

A6

A7

A1

A5

A6

A8

A9

];

Left Join (Data)

Load UniqueID,

If(Count(UniqueID)>1,1) as Count

Resident Data

Group by UniqueID;

devarasu07
Master II
Master II

Hi,

try below method

Data:

LOAD * INLINE [

    Unique ID

    A1

    A2

    A3

    A4

    A5

    A6

    A7

    A1

    A5

    A6

    A8

    A9

];

left join (Data)

load distinct [Unique ID],if(count([Unique ID])>1,1,0)  as Flag Resident Data group by [Unique ID];

Capture.JPG

Anonymous
Not applicable
Author

Hi Kaushik

I've applied your script on a test model and it worked.  Some how its not working with the model I'm busy with.  My unique ID is a concatenation of many fields.  Let me drill down to the detail of each field.

Kind regards

Nayan

kaushiknsolanki
Partner Ambassador/MVP
Partner Ambassador/MVP

Sure.

Regards,

Kaushik Solanki

Please remember to hit the 'Like' button and for helpful answers and resolutions, click on the 'Accept As Solution' button. Cheers!
Clever_Anjos
Employee
Employee

Another approach:

Base:

LOAD [Unique ID],

IF(Exists([Unique ID]),1) as Flag

FROM [https://community.qlik.com/thread/288848](html, codepage is 1252, embedded labels, table is @1);

m:

mapping Load

[Unique ID],

sum(Flag) as Flag

Resident Base

Group By [Unique ID];

Final:

Load

[Unique ID],

ApplyMap('m',[Unique ID]) as Duplicate

Resident Base;

Drop Table Base;

Anonymous
Not applicable
Author

Thank you all.  All solutions given has worked.

Kind regards

Nayan