Do not input private or sensitive data. View Qlik Privacy & Cookie Policy.
Skip to main content

Announcements
Qlik Open Lakehouse is Now Generally Available! Discover the key highlights and partner resources here.
cancel
Showing results for 
Search instead for 
Did you mean: 
m_perreault
Creator III
Creator III

Identifying Missing Values by Date

Hi All,


I have a data table such as the below.

 

IdentifierDateClass
Alpha05/31/2017A
Alpha05/31/2017B
Alpha05/31/2017C
Alpha05/31/2017D
Alpha07/31/2017A
Alpha07/31/2017B

 

Over time Classes will be removed from Identifier & Date combinations.  In the above on 5/31/2017 Alpha had A,B,C,D for Classes while on 7/31/2017 it only had A,B.  I'd like to create a table in my data model such as the below.

IdentifierDateDropped Classes
Alpha07/31/2017C,D

 

Does anyone have any ideas?

 

Thanks,
Mark

Labels (2)
3 Replies
sunny_talwar

Try something like this

Table:
LOAD *,
	 Hash128(Identifier&'|'&Date&'|'&Class) as Key;
LOAD * INLINE [
    Identifier, Date, Class
    Alpha, 05/31/2017, A
    Alpha, 05/31/2017, B
    Alpha, 05/31/2017, C
    Alpha, 05/31/2017, D
    Alpha, 07/31/2017, A
    Alpha, 07/31/2017, B
];

TempTable:
LOAD Distinct Identifier,
	 Class
Resident Table;

Left Join (TempTable)
LOAD DISTINCT Date
Resident Table;

DroppedClasses:
LOAD Identifier,
	 Date,
	 Concat(Class, ', ') as [Dropped Classes]
Resident TempTable
Where not Exists(Key, Hash128(Identifier&'|'&Date&'|'&Class))
Group By Identifier, Date;

DROP Table TempTable;
m_perreault
Creator III
Creator III
Author

Hi Sunny,

 

This works very well but when I try and expand it to a larger data set it does not quite work as I want.  For example if I add new date row

IdentifierDateClass
Alpha05/31/2017A
Alpha05/31/2017B
Alpha05/31/2017C
Alpha05/31/2017D
Alpha07/31/2017A
Alpha07/31/2017B
Alpha08/31/2017A

 

Using the script you provided the output of this would be

IdentifierDateDropped Classes
Alpha07/31/2017C,D
Alpha08/31/2017B,C,D

 

Where what I am looking for is.

IdentifierDateDropped Classes
Alpha07/31/2017C,D
Alpha08/31/2017B

 

The current script looks to the whole set of classes for each date where I'd like to look to the most recent set.


Thanks,
Mark

sunny_talwar

Try it out with some modifications

Table:
LOAD *,
	 Hash128(Identifier&'|'&Date&'|'&Class) as Key;
LOAD * INLINE [
    Identifier, Date, Class
    Alpha, 05/31/2017, A
    Alpha, 05/31/2017, B
    Alpha, 05/31/2017, C
    Alpha, 05/31/2017, D
    Alpha, 07/31/2017, A
    Alpha, 07/31/2017, B
    Alpha, 08/31/2017, A
];

TempTable:
LOAD Distinct Identifier,
	 Class
Resident Table;

Left Join (TempTable)
LOAD DISTINCT Date
Resident Table;

TempDroppedClasses:
LOAD *
Where NOT Exists(New_Class);
LOAD Identifier,
	 Date,
	 Class as New_Class
Resident TempTable
Where not Exists(Key, Hash128(Identifier&'|'&Date&'|'&Class))
Order By Identifier, Class, Date;

DroppedClasses:
LOAD Identifier,
	 Date,
	 Concat(New_Class, ', ') as [Dropped Classes]
Resident TempDroppedClasses
Group By Identifier, Date;

DROP Table TempTable, TempDroppedClasses;