Skip to main content
Announcements
Join us at Qlik Connect for 3 magical days of learning, networking,and inspiration! REGISTER TODAY and save!
cancel
Showing results for 
Search instead for 
Did you mean: 
Not applicable

Report to Identify Duplicates

I have some duplicates records that I need to identify and report.

The criteria is to check several fields if they have the same values (there are able 8 fields to check, Date, Customer, Procedure Time, Manager Assigned, etc.)

What is the best way to do this? Thank you in advance.

5 Replies
swuehl
MVP
MVP

So you want to check for records that show identical values in all 8 fields?

Maybe like this (example for 4 fields)

    

LOAD recno() as RecID,

          Date,

          Customer,

          [Procedure Time],

          [Manager Assigned],

          Hash256(Date, Customer,[Procedure Time], [Manager Assigned]) as Key

FROM ...;


Then create a list box for field Key and enable frequency option. Also sort by frequency descending.


Create a table box with fields RecID, Date, [Procedure Time] and [Manager Assigned] (add other fields from the table as you like).


Select Keys with a frequency >1.


         

    

Not applicable
Author

Thank you for your reply. What does Hash256 do?

Anonymous
Not applicable
Author

Hi,

Make a key using the same combination of fields using autonumber or such functions and put them in a list box with frequency checked on. Select keys with freq.>1

HTH

jagan
Luminary Alumni
Luminary Alumni

Hi Melissa,

Try like this

Chart : Straight Table

Dimensions: Date, Customer, Procedure Time, Manager Assigned, etc.

Expression: If(Count(Customer) > 1, 'Duplicate', Null())

Hope this helps you.

Regards,

jagan.

swuehl
MVP
MVP

It justs calculates a unique value per combination of arguments to the function (well, with theoretical limitations of uniqueness).

Instead the Hash256() function, you can also use Autonumber() or just a concatenation, like

Customer & '-' & [Procedure Time] &'-'&[Manager Assigned] as Key

What Jagan suggested is also a good solution, without the need to create a Key in the data model, here the key is created by using all the fields as dimensions.