Duplicate Flag Indicator Creation - Qlik Community

Report Inappropriate Content · ‎2014-01-27

Dear all,

I'm fairly new to Qlikview so apologies if this seems a simple question:

Below is some Data (Just 2 rows for an example) , the last column being my current result from the code below

PolicyNumber	ReportDate	PolicyVersionNumber	DuplicateKey	Result Currently for RemoveDuplicate
AB123	01/03/2013	3	AB123 01/03/2013	1
AB123	01/03/2013	4	AB123 01/03/2013

This code identifies the minimum PolicyVersionNumber by the DuplicateKey field and then creates a table containing

a RemoveDuplicate flag with value of 1. In reality There are thousands of rows of data and the small data set only contains about

60 rows of data.

However I have been trying to modify the code so the small table contains a "0" to indicate it is not a duplicate. Meaning there would 100% match between the subset table and the main policy table when matching on the PolicyKey

Note: Policykey is just the PolicyNumber and the PolicyVersion Number.

Code so far:

DUPLICATES:

LOAD

DuplicateKey,

PolicyNumber & ' ' & MIN(PolicyVersionNumber) as PolicyKey,

1 AS RemoveDuplicate

RESIDENT POLICIES

GROUP BY DuplicateKey, PolicyNumber;

/* Count the number of duplicates by the DuplicateKey */

INNER JOIN (DUPLICATES)

LOAD *

WHERE DuplicateCount>1;

LOAD

DuplicateKey,

count(DuplicateKey) as DuplicateCount

RESIDENT POLICIES

WHERE MATCH(PolicyVersionStatusId,3,5,6)

GROUP BY DuplicateKey;

//*** Drop Temporary fields

DROP FIELDS DuplicateKey, DuplicateCount;

Required Result:

PolicyNumber	ReportDate	PolicyVersionNumber	DuplicateKey	Result Required for RemoveDuplicate
AB123	01/03/2013	3	AB123 01/03/2013	1
AB123	01/03/2013	4	AB123 01/03/2013	0

Nicole-Smith · ‎2014-01-27

What about something like the following (both things that I've added/changed are commented with ***NS):

DUPLICATES:

LOAD

    DuplicateKey,

    PolicyNumber & ' ' & MIN(PolicyVersionNumber) as PolicyKey,

    1 AS RemoveDuplicateTemp //***NS - renamed this to RemoveDuplicateTemp

RESIDENT POLICIES

GROUP BY DuplicateKey, PolicyNumber;

/* Count the number of duplicates by the DuplicateKey */

INNER JOIN (DUPLICATES)

LOAD *

WHERE DuplicateCount>1;

LOAD

DuplicateKey,

count(DuplicateKey) as DuplicateCount

RESIDENT POLICIES

WHERE MATCH(PolicyVersionStatusId,3,5,6)

GROUP BY DuplicateKey;

//*** Drop Temporary fields

DROP FIELDS DuplicateKey, DuplicateCount;

///***NS - Add this to replace anything that isn't a one with a zero, then get rid of original field

LEFT JOIN (DUPLICATES)

LOAD *,

     if(RemoveDuplicateTemp=1, 1, 0) as RemoveDuplicate

RESIDENT DUPLICATES;

DROP FIELD RemoveDuplicateTemp;

Report Inappropriate Content · ‎2014-01-27

This doesn't work as the duplicates table you are creating the resident from only contains the 60 rows;

I have manged to do it by creating another step , but it is not as slick as I think it could be!

I create a mapping table then -map this on , and where there is no match assign a 0. It works but having to create a another table and drop my original policies table seems like over processing/ bigger refresh times.

DTEMP:

LOAD

DuplicateKey,

PolicyNumber & ' ' & MIN(PolicyVersionNumber) as PolicyKey,

1 AS RemoveDuplicate

RESIDENT POLICIES

GROUP BY DuplicateKey, PolicyNumber;

/* Count the number of duplicates by the DuplicateKey */

INNER JOIN (DTEMP)

LOAD *

WHERE DuplicateCount>1;

LOAD

DuplicateKey,

count(DuplicateKey) as DuplicateCount

RESIDENT POLICIES

WHERE MATCH(PolicyVersionStatusId,3,5,6)

GROUP BY DuplicateKey;

DUPLICATES:

MAPPING LOAD

PolicyKey,

RemoveDuplicate

RESIDENT DTEMP;

//*** Drop Temporary fields

DROP FIELDS DuplicateKey, DuplicateCount;

DROP TABLE DTEMP;

FACTS:

LOAD

*,

APPLYMAP('DUPLICATES',PolicyKey,0) AS RemoveDuplicate

RESIDENT POLICIES;

DROP TABLE POLICIES;