Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi All,
I have a data table such as the below.
Identifier | Date | Class |
Alpha | 05/31/2017 | A |
Alpha | 05/31/2017 | B |
Alpha | 05/31/2017 | C |
Alpha | 05/31/2017 | D |
Alpha | 07/31/2017 | A |
Alpha | 07/31/2017 | B |
Over time Classes will be removed from Identifier & Date combinations. In the above on 5/31/2017 Alpha had A,B,C,D for Classes while on 7/31/2017 it only had A,B. I'd like to create a table in my data model such as the below.
Identifier | Date | Dropped Classes |
Alpha | 07/31/2017 | C,D |
Does anyone have any ideas?
Thanks,
Mark
Try something like this
Table: LOAD *, Hash128(Identifier&'|'&Date&'|'&Class) as Key; LOAD * INLINE [ Identifier, Date, Class Alpha, 05/31/2017, A Alpha, 05/31/2017, B Alpha, 05/31/2017, C Alpha, 05/31/2017, D Alpha, 07/31/2017, A Alpha, 07/31/2017, B ]; TempTable: LOAD Distinct Identifier, Class Resident Table; Left Join (TempTable) LOAD DISTINCT Date Resident Table; DroppedClasses: LOAD Identifier, Date, Concat(Class, ', ') as [Dropped Classes] Resident TempTable Where not Exists(Key, Hash128(Identifier&'|'&Date&'|'&Class)) Group By Identifier, Date; DROP Table TempTable;
Hi Sunny,
This works very well but when I try and expand it to a larger data set it does not quite work as I want. For example if I add new date row
Identifier | Date | Class |
Alpha | 05/31/2017 | A |
Alpha | 05/31/2017 | B |
Alpha | 05/31/2017 | C |
Alpha | 05/31/2017 | D |
Alpha | 07/31/2017 | A |
Alpha | 07/31/2017 | B |
Alpha | 08/31/2017 | A |
Using the script you provided the output of this would be
Identifier | Date | Dropped Classes |
Alpha | 07/31/2017 | C,D |
Alpha | 08/31/2017 | B,C,D |
Where what I am looking for is.
Identifier | Date | Dropped Classes |
Alpha | 07/31/2017 | C,D |
Alpha | 08/31/2017 | B |
The current script looks to the whole set of classes for each date where I'd like to look to the most recent set.
Thanks,
Mark
Try it out with some modifications
Table: LOAD *, Hash128(Identifier&'|'&Date&'|'&Class) as Key; LOAD * INLINE [ Identifier, Date, Class Alpha, 05/31/2017, A Alpha, 05/31/2017, B Alpha, 05/31/2017, C Alpha, 05/31/2017, D Alpha, 07/31/2017, A Alpha, 07/31/2017, B Alpha, 08/31/2017, A ]; TempTable: LOAD Distinct Identifier, Class Resident Table; Left Join (TempTable) LOAD DISTINCT Date Resident Table; TempDroppedClasses: LOAD * Where NOT Exists(New_Class); LOAD Identifier, Date, Class as New_Class Resident TempTable Where not Exists(Key, Hash128(Identifier&'|'&Date&'|'&Class)) Order By Identifier, Class, Date; DroppedClasses: LOAD Identifier, Date, Concat(New_Class, ', ') as [Dropped Classes] Resident TempDroppedClasses Group By Identifier, Date; DROP Table TempTable, TempDroppedClasses;