Skip to main content
Announcements
Join us at Qlik Connect for 3 magical days of learning, networking,and inspiration! REGISTER TODAY and save!
cancel
Showing results for 
Search instead for 
Did you mean: 
m_perreault
Creator III
Creator III

Identifying Missing Values by Date

Hi All,


I have a data table such as the below.

 

IdentifierDateClass
Alpha05/31/2017A
Alpha05/31/2017B
Alpha05/31/2017C
Alpha05/31/2017D
Alpha07/31/2017A
Alpha07/31/2017B

 

Over time Classes will be removed from Identifier & Date combinations.  In the above on 5/31/2017 Alpha had A,B,C,D for Classes while on 7/31/2017 it only had A,B.  I'd like to create a table in my data model such as the below.

IdentifierDateDropped Classes
Alpha07/31/2017C,D

 

Does anyone have any ideas?

 

Thanks,
Mark

Labels (2)
3 Replies
sunny_talwar

Try something like this

Table:
LOAD *,
	 Hash128(Identifier&'|'&Date&'|'&Class) as Key;
LOAD * INLINE [
    Identifier, Date, Class
    Alpha, 05/31/2017, A
    Alpha, 05/31/2017, B
    Alpha, 05/31/2017, C
    Alpha, 05/31/2017, D
    Alpha, 07/31/2017, A
    Alpha, 07/31/2017, B
];

TempTable:
LOAD Distinct Identifier,
	 Class
Resident Table;

Left Join (TempTable)
LOAD DISTINCT Date
Resident Table;

DroppedClasses:
LOAD Identifier,
	 Date,
	 Concat(Class, ', ') as [Dropped Classes]
Resident TempTable
Where not Exists(Key, Hash128(Identifier&'|'&Date&'|'&Class))
Group By Identifier, Date;

DROP Table TempTable;
m_perreault
Creator III
Creator III
Author

Hi Sunny,

 

This works very well but when I try and expand it to a larger data set it does not quite work as I want.  For example if I add new date row

IdentifierDateClass
Alpha05/31/2017A
Alpha05/31/2017B
Alpha05/31/2017C
Alpha05/31/2017D
Alpha07/31/2017A
Alpha07/31/2017B
Alpha08/31/2017A

 

Using the script you provided the output of this would be

IdentifierDateDropped Classes
Alpha07/31/2017C,D
Alpha08/31/2017B,C,D

 

Where what I am looking for is.

IdentifierDateDropped Classes
Alpha07/31/2017C,D
Alpha08/31/2017B

 

The current script looks to the whole set of classes for each date where I'd like to look to the most recent set.


Thanks,
Mark

sunny_talwar

Try it out with some modifications

Table:
LOAD *,
	 Hash128(Identifier&'|'&Date&'|'&Class) as Key;
LOAD * INLINE [
    Identifier, Date, Class
    Alpha, 05/31/2017, A
    Alpha, 05/31/2017, B
    Alpha, 05/31/2017, C
    Alpha, 05/31/2017, D
    Alpha, 07/31/2017, A
    Alpha, 07/31/2017, B
    Alpha, 08/31/2017, A
];

TempTable:
LOAD Distinct Identifier,
	 Class
Resident Table;

Left Join (TempTable)
LOAD DISTINCT Date
Resident Table;

TempDroppedClasses:
LOAD *
Where NOT Exists(New_Class);
LOAD Identifier,
	 Date,
	 Class as New_Class
Resident TempTable
Where not Exists(Key, Hash128(Identifier&'|'&Date&'|'&Class))
Order By Identifier, Class, Date;

DroppedClasses:
LOAD Identifier,
	 Date,
	 Concat(New_Class, ', ') as [Dropped Classes]
Resident TempDroppedClasses
Group By Identifier, Date;

DROP Table TempTable, TempDroppedClasses;