Do not input private or sensitive data. View Qlik Privacy & Cookie Policy.
Skip to main content

Announcements
Qlik Open Lakehouse is Now Generally Available! Discover the key highlights and partner resources here.
cancel
Showing results for 
Search instead for 
Did you mean: 
Not applicable

Report to Identify Duplicates

I have some duplicates records that I need to identify and report.

The criteria is to check several fields if they have the same values (there are able 8 fields to check, Date, Customer, Procedure Time, Manager Assigned, etc.)

What is the best way to do this? Thank you in advance.

5 Replies
swuehl
MVP
MVP

So you want to check for records that show identical values in all 8 fields?

Maybe like this (example for 4 fields)

    

LOAD recno() as RecID,

          Date,

          Customer,

          [Procedure Time],

          [Manager Assigned],

          Hash256(Date, Customer,[Procedure Time], [Manager Assigned]) as Key

FROM ...;


Then create a list box for field Key and enable frequency option. Also sort by frequency descending.


Create a table box with fields RecID, Date, [Procedure Time] and [Manager Assigned] (add other fields from the table as you like).


Select Keys with a frequency >1.


         

    

Not applicable
Author

Thank you for your reply. What does Hash256 do?

Anonymous
Not applicable
Author

Hi,

Make a key using the same combination of fields using autonumber or such functions and put them in a list box with frequency checked on. Select keys with freq.>1

HTH

jagan
Partner - Champion III
Partner - Champion III

Hi Melissa,

Try like this

Chart : Straight Table

Dimensions: Date, Customer, Procedure Time, Manager Assigned, etc.

Expression: If(Count(Customer) > 1, 'Duplicate', Null())

Hope this helps you.

Regards,

jagan.

swuehl
MVP
MVP

It justs calculates a unique value per combination of arguments to the function (well, with theoretical limitations of uniqueness).

Instead the Hash256() function, you can also use Autonumber() or just a concatenation, like

Customer & '-' & [Procedure Time] &'-'&[Manager Assigned] as Key

What Jagan suggested is also a good solution, without the need to create a Key in the data model, here the key is created by using all the fields as dimensions.