Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 
simonwithey
Contributor
Contributor

identifying Duplicates

As a Qlik virgin I need the help of you experts. I need to identify and set a flag against a duplicate record in a load script. Example ogf the table below. I need to check if the Id's are the same and if so i need to set a flag against the oldest date only and set the later date as OK. If the dates are equal i just need to set the first record as a duplicate. many thanks

  

IdAuth Date
C001833719-Jul-17
C001833727-Jul-17
CX21179502-Jun-17
CX21179502-Jun-17
CX21219808-Jun-17
CX21219809-Jun-17
N000248808-Jun-17
N000248814-Jun-17
P000074012-Jun-17
P000074016-Jun-17
R000024906-Jul-17
R000024925-Jul-17
19 Replies
Miguel_Angel_Baeyens

Where are you loading the data from: database, file?

Simple example for database:

CONNECT TO [Database]; // Use the button "Connect..." in the script editor for the wizard

Table:

LOAD *;

// Put your sql statement here to retrieve the rows

SQL SELECT *

FROM Table;

Simple example for file:

Table:

LOAD *

FROM File.qvd (Qvd); // type of file and options can be different, use "Table Files" button in the script editor

simonwithey
Contributor
Contributor
Author

The file is already loaded (see example below). It's saying that 'Table' doesn't exist. If you could fill in where and what I should be entering to make it work I'd be very grateful. Many thanks in advance.

[REP]:

LOAD

    "Assessment No",

    "Creator Name",

    "Vehicle Manufacturer Name",

    "Vehicle Model Name",

    "Creator Site Id",

    "Creator Terminal Id",

    "Vehicle Reg",

    "First Auth Date",

FROM [lib://VIA Motor Drive (Perf & Planning)/Motor Repair\MOTOR REPAIR 2017.xlsx]

(ooxml, embedded labels, table is REP);

Table:

Load

"Assessment No",

"First Auth Date";

FinalTable:

Load

"Assessment No",

"First Auth Date",

If("Assessment No"=Previous("Assessment No"),'DUP','')as [DUP Flag]

Resident Table

Order By "Assessment No",[First Auth Date]desc;

Drop Table Table;

sunny_talwar

You will resident from REP instead of Table

[REP]:

LOAD

    "Assessment No",

    "Creator Name",

    "Vehicle Manufacturer Name",

    "Vehicle Model Name",

    "Creator Site Id",

    "Creator Terminal Id",

    "Vehicle Reg",

    "First Auth Date",

FROM [lib://VIA Motor Drive (Perf & Planning)/Motor Repair\MOTOR REPAIR 2017.xlsx]

(ooxml, embedded labels, table is REP);

Final_REP:

Load *,

    If("Assessment No"=Previous("Assessment No"),'DUP','')as [DUP Flag]

Resident REP

Order By "Assessment No", [First Auth Date] desc;

Drop Table REP;

simonwithey
Contributor
Contributor
Author

Cheers Sunny, looks like I'm nearly there. I just need to do some tweaking as to whereabouts I put the code as I have a lot more to identify with other criteria. But many many thanks for your help and patience.

sunny_talwar

No problem Simon, one of us will always be there to help you out.

Best,

Sunny

passionate
Specialist
Specialist

Hi Simon,

PFA, Solution.

Regards,

Pankaj

Miguel_Angel_Baeyens

Another way is loading your table building a key field or flag field with all what identifies a duplicate, and on a preceding load, use the Exists() function to search for it and load only relevant values:

Table:

LOAD

  *

WHERE NOT Exists(Key)

  ;

LOAD

  *

  ,Field1 & '|' & Field2 & '|' & Field 3 AS Key

FROM File;

EDIT: Example attached for perusal

upaliwije
Creator II
Creator II

Dear Sunny

I am also interested in your script of finding duplicates.

  If("Assessment No"=Previous("Assessment No"),'DUP','')as [DUP Flag]


if I am not mistaken Pls educate me what does  ,'' part do in script (shown in red colour above)

Thanks

Anil_Babu_Samineni

May be sunny sleeping as he located in USA. Why don't you test in your end rather expecting from community

Here, The part you are expecting works as

Sample Data

FieldName

1

2

3

3

If(FieldName = Previous(FieldName), 'DUP', ' ') as Flag

Output will come like below

FieldName, Flag

1, Space // Because this is not the part the your if condition fulfilled.

2, Space

3, Space

3, DUP

Rather, You can test for your case

If(FieldName = Previous(FieldName), 'DUP', 'Not DUP') as Flag

Best Anil, When applicable please mark the correct/appropriate replies as "solution" (you can mark up to 3 "solutions". Please LIKE threads if the provided solution is helpful
upaliwije
Creator II
Creator II

Thanks