Solved: Re: Identifying Duplicate values and result - Qlik Community

dhanu_today · ‎2021-01-08

Hi Everyone,

I have data like below and output need show as below.

I have to show the EDIT as 1 for the ID’s which has more than 1 resolved.

e.g. A101 has 2 resolved so at 22/12/2020 (i.e. Max resolved) need to show Edit as 1 and QA as 0 and Info as 0

A103 has 2 resolved so at 29/12/2020 (i.e. Max resolved) need to show Edit as 1 and QA as 0 and Info as 0

Data:

load * Inline
[
ID, Opendate, Resolved, QA, Info
A101, 01/12/2020, 03/12/2020, 1, 2
A102, 01/12/2020, 04/12/2020, 1, 3
A101, 01/12/2020, 22/12/2020, 0, 0
A103, 23/12/2020, 25/12/2020, 1, 5
A103, 23/12/2020, 29/12/2020, 0, 0
A104, 24/12/2020, 25/12/2020, 2 4

];

Output:

ID	Opendate	Resolved	QA	Info	Edit
A101	01/12/2020	03/12/2020	1	2	0
A102	01/12/2020	04/12/2020	1	3	0
A101	01/12/2020	22/12/2020	0	0	1
A103	23/12/2020	25/12/2020	1	5	0
A103	23/12/2020	29/12/2020	0	0	1
A104	24/12/2020	25/12/2020	2	4	0

Thanks,

Dhanu

vhayward · ‎2021-01-20

Hej,

I have created this script which guides you through all the steps to finally get the output as desired in your requirements. In the same time you can easily adjust the scripts when the requirements are changing.

So the emphasis is not do the solution in the least amount of lines possible, but providing a solution that can be read and understood easily by any developer from unexperienced to very experienced.

// Starting Data Set
TMP_DataSet:
LOAD * INLINE [
ID, Opendate, Resolved, QA, Info
A101, 01/12/2020, 03/12/2020, 1, 2
A102, 01/12/2020, 04/12/2020, 1, 3
A101, 01/12/2020, 22/12/2020, 1, 2
A103, 23/12/2020, 25/12/2020, 1, 5
A103, 23/12/2020, 29/12/2020, 1, 5
A104, 24/12/2020, 25/12/2020, 2 ,4
];

// Determine the max. resolved per unique ID.
MAP_MaxResolved:
MAPPING LOAD DISTINCT
[ID] AS [ID]
, MAXSTRING([Resolved]) AS [Max Resolved]
RESIDENT TMP_DataSet
GROUP BY [ID];

// Determine the amount of instances per unique ID
MAP_CountResolved:
MAPPING LOAD DISTINCT
[ID] AS [ID]
, COUNT([Resolved]) AS [Count Resolved]
RESIDENT TMP_DataSet
GROUP BY [ID];

// Add the max resolved and the count resolved to the data set
ENRICHED_DataSet:
NOCONCATENATE LOAD
[ID] AS [ID]
, [Opendate] AS [Opendate]
, [Resolved] AS [Resolved]
, [QA] AS [QA]
, [Info] AS [Info]
, APPLYMAP('MAP_MaxResolved', [ID], NULL()) AS [Max Resolved]
, APPLYMAP('MAP_CountResolved', [ID], 1) AS [Count Resolved]
RESIDENT TMP_DataSet;
DROP TABLE TMP_DataSet;

// Set the requirements to get the proper values at the QA, Info and Edit fields.
FINAL_DataSet:
NOCONCATENATE LOAD
[ID] AS [ID]
, [Opendate] AS [Opendate]
, [Resolved] AS [Resolved]
, IF([Count Resolved] > 1 AND [Resolved] = [Max Resolved], 0, [QA]) AS [QA]
, IF([Count Resolved] > 1 AND [Resolved] = [Max Resolved], 0, [Info]) AS [Info]
, IF([Count Resolved] > 1 AND [Resolved] = [Max Resolved], 1, 0) AS [Edit]
RESIDENT ENRICHED_DataSet;
DROP TABLE ENRICHED_DataSet;

Output:

Best regards,
Vincent Hayward

View solution in original post

dhanu_today · ‎2021-01-08

Updated:

Hi Everyone,

I have data like below and output need show as below.

I have to show the EDIT as 1 for the ID’s which has more than 1 resolved.

e.g. A101 has 2 resolved so at 22/12/2020 (i.e. Max resolved) need to show Edit as 1 and QA as 0 and Info as 0

A103 has 2 resolved so at 29/12/2020 (i.e. Max resolved) need to show Edit as 1 and QA as 0 and Info as 0

Data:

load * Inline
[
ID, Opendate, Resolved, QA, Info
A101, 01/12/2020, 03/12/2020, 1, 2
A102, 01/12/2020, 04/12/2020, 1, 3
A101, 01/12/2020, 22/12/2020, 1, 2
A103, 23/12/2020, 25/12/2020, 1, 5
A103, 23/12/2020, 29/12/2020, 1, 5
A104, 24/12/2020, 25/12/2020, 2 ,4

];

Output:

ID	Opendate	Resolved	QA	Info	Edit
A101	01/12/2020	03/12/2020	1	2	0
A102	01/12/2020	04/12/2020	1	3	0
A101	01/12/2020	22/12/2020	0	0	1
A103	23/12/2020	25/12/2020	1	5	0
A103	23/12/2020	29/12/2020	0	0	1
A104	24/12/2020	25/12/2020	2	4	0

Thanks,

Dhanu

Ksrinivasan · ‎2021-01-08

Hi,

In edit coloumn use count if function. Like count if I'd>1, then return 1 or 0.

Then you have to recreate QA and Info coloum with if condition. That is if edit coloumn is =1, return 0 or return exist value.

Regards,

Ksrinivasan

dhanu_today · ‎2021-01-20

Hi,

can someone provide some clue on this for solution.

TIA

Dhanu

Ksrinivasan · ‎2021-01-20

hi,

find the script it will fulfill your need,

Data:
load
*,
if(AutoNumber(Rowno(),ID)='2','1','0') as EDIT;
load
* Inline
[
ID, Opendate, Resolved, QA, Info,
A101, 01/12/2020, 03/12/2020, 1, 2
A102, 01/12/2020, 04/12/2020, 1, 3
A101, 01/12/2020, 22/12/2020, 1, 2
A103, 23/12/2020, 25/12/2020, 1, 5
A103, 23/12/2020, 29/12/2020, 1, 5
A104, 24/12/2020, 25/12/2020, 2 ,4

];

result:

ksrinivasan

vhayward · ‎2021-01-20

Hej,

I have created this script which guides you through all the steps to finally get the output as desired in your requirements. In the same time you can easily adjust the scripts when the requirements are changing.

So the emphasis is not do the solution in the least amount of lines possible, but providing a solution that can be read and understood easily by any developer from unexperienced to very experienced.

// Starting Data Set
TMP_DataSet:
LOAD * INLINE [
ID, Opendate, Resolved, QA, Info
A101, 01/12/2020, 03/12/2020, 1, 2
A102, 01/12/2020, 04/12/2020, 1, 3
A101, 01/12/2020, 22/12/2020, 1, 2
A103, 23/12/2020, 25/12/2020, 1, 5
A103, 23/12/2020, 29/12/2020, 1, 5
A104, 24/12/2020, 25/12/2020, 2 ,4
];

// Determine the max. resolved per unique ID.
MAP_MaxResolved:
MAPPING LOAD DISTINCT
[ID] AS [ID]
, MAXSTRING([Resolved]) AS [Max Resolved]
RESIDENT TMP_DataSet
GROUP BY [ID];

// Determine the amount of instances per unique ID
MAP_CountResolved:
MAPPING LOAD DISTINCT
[ID] AS [ID]
, COUNT([Resolved]) AS [Count Resolved]
RESIDENT TMP_DataSet
GROUP BY [ID];

// Add the max resolved and the count resolved to the data set
ENRICHED_DataSet:
NOCONCATENATE LOAD
[ID] AS [ID]
, [Opendate] AS [Opendate]
, [Resolved] AS [Resolved]
, [QA] AS [QA]
, [Info] AS [Info]
, APPLYMAP('MAP_MaxResolved', [ID], NULL()) AS [Max Resolved]
, APPLYMAP('MAP_CountResolved', [ID], 1) AS [Count Resolved]
RESIDENT TMP_DataSet;
DROP TABLE TMP_DataSet;

// Set the requirements to get the proper values at the QA, Info and Edit fields.
FINAL_DataSet:
NOCONCATENATE LOAD
[ID] AS [ID]
, [Opendate] AS [Opendate]
, [Resolved] AS [Resolved]
, IF([Count Resolved] > 1 AND [Resolved] = [Max Resolved], 0, [QA]) AS [QA]
, IF([Count Resolved] > 1 AND [Resolved] = [Max Resolved], 0, [Info]) AS [Info]
, IF([Count Resolved] > 1 AND [Resolved] = [Max Resolved], 1, 0) AS [Edit]
RESIDENT ENRICHED_DataSet;
DROP TABLE ENRICHED_DataSet;

Output:

Best regards,
Vincent Hayward

Saravanan_Desingh · ‎2021-01-20

One solution is.

tab1:
LOAD * INLINE [
ID, Opendate, Resolved, QA, Info
A101, 01/12/2020, 03/12/2020, 1, 2
A102, 01/12/2020, 04/12/2020, 1, 3
A101, 01/12/2020, 22/12/2020, 1, 2
A103, 23/12/2020, 25/12/2020, 1, 5
A103, 23/12/2020, 29/12/2020, 1, 5
A104, 24/12/2020, 25/12/2020, 2 ,4
];

Left Join(tab1)
LOAD ID, Date(Max(Date#(Resolved,'DD/MM/YYYY'))) As MaxDt,
	Count(ID) As Cnt
Resident tab1
Group By ID;

tab2:
LOAD ID, Opendate, Resolved, If(Cnt>1 And Date#(Resolved,'DD/MM/YYYY')=MaxDt,0,QA) As QA,
	If(Cnt>1 And Date#(Resolved,'DD/MM/YYYY')=MaxDt,0,Info) As Info,
	If(Cnt>1 And Date#(Resolved,'DD/MM/YYYY')=MaxDt,1,0) As Edit
Resident tab1;

Drop Table tab1;

Saravanan_Desingh · ‎2021-01-20

Output: