Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hey guys. I have a tricky problem I'm trying to figure out.
I have a set of spreadsheets, one per month, that contain data from essentially the same source, so the same record can exist in multiple spreadsheets, but they go away after they're completed. These records have a numeric value that can change month to month. I want to create an aggregate table that contains all records from all data sets, but only the most recent numeric value for which that record occurs.
There's a trick though. I was hoping to use "Where Not Exists()", but I realized I have multiple fields I need to key off of. So each record has multiple event types, and I need to take the most recent value for each record for each event type.
So here's an example. Say I have these three data sets:
Jan:
LOAD * Inline
[Record, EventType, Value
A, Foo, 5
A, Bar, 3
B, Foo, 2];
Feb:
LOAD * Inline
[Record, EventType, Value
A, Foo, 1
A, Bar, 1
C, Foo, 2];
Mar:
LOAD * Inline
[Record, EventType, Value
C, Foo, 5
C, Bar, 2
];
The final result set should look like this:
Final:
LOAD * Inline
[Record, EventType, Value
A, Foo, 1
A, Bar, 1
B, Foo, 2
C, Foo, 5
C, Bar, 2];
So my thought was to start by loading the most recent data set, then concatenate the previous sets, but only for the rows where the combination of Record and EventType doesn't already exist in the Final table. I thought to use Where Not Exists(), but I need it to key off of the combination of Record and EventType. I'm sure this is just a syntax thing, but I'm not getting the results I expected.
Thanks a lot for the help!
Eric
Are you thinking of something like this:
Mar:
LOAD *, Record & EventType as Key Inline
[Record, EventType, Value
C, Foo, 5
C, Bar, 2
];
Feb:
LOAD *, Record & EventType as Key Inline
[Record, EventType, Value
A, Foo, 1
A, Bar, 1
C, Foo, 2]
where not exists(Key, Record&EventType);
Jan:
LOAD *, Record & EventType as Key Inline
[Record, EventType, Value
A, Foo, 5
A, Bar, 3
B, Foo, 2]
where not exists(Key, Record&EventType);
drop field Key;
[edit] Another approach could look like this:
INPUT:
LOAD *, 1 as TableID Inline
[Record, EventType, Value
A, Foo, 5
A, Bar, 3
B, Foo, 2];
LOAD *,2 as TableID Inline
[Record, EventType, Value
A, Foo, 1
A, Bar, 1
C, Foo, 2];
LOAD *,3 as TableID Inline
[Record, EventType, Value
C, Foo, 5
C, Bar, 2
];
RESULT:
NoConcatenate
LOAD Record, EventType, FirstSortedValue(Value, -TableID) as Value
Resident INPUT
group by Record, EventType;
drop table INPUT;
Ahh I like that second approach. I'll give that a shot.
I was trying to do something like this:
Result:
LOAD * RESIDENT Mar;
CONCATENATE LOAD * RESIDENT Feb WHERE NOT (Exists(Record) AND Exists(EventType));
CONCATENATE LOAD * RESIDENT Jan WHERE NOT (Exists(Record) AND Exists(EventType));