Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hey,
Was wondering if anyone has ran into the following issue.
I am trying to find Closest date to another date while having multiple entries.
I calculate days between and filter for min datebetween, but the issue is that i get multiple min days when grouping for final table. first sorted does not work because a single customer can have multiple events with same days between usually 0. And i cannot think of any key that would work
looks something like this:
TMP:
NoConcatenate
LOAD
*
where DAYS_BETWEEN >= 0 ;
load*,
if(isNULL(days_between_TEMP),0,days_between_TEMP) as DAYS_BETWEEN;
load *, EVENT_DATE - RISK_DATE as days_between_TEMP
RESIDENT QUACK;
FINAL:
LOAD CUSTOMER_ID,
SCORE,
EVENT_DATE,
MIN(DAYS_BETWEEN) as DAYS_TO_RISK
Resident TMP
group by CUSTOMER_ID,EVENT_DATE,RISK_DATE,SCORE;
DROP TABLEs QUACK, TMP;
Hi @Zen_Not_On
I would like to help you, but I'm not understanding very well what you are trying to accomplish. Could you, please, provide a data sample with some examples and the outcome that you are expecting for them?
In this way it will be easier to analyze it.
Thank you
Daniel
Including the event/risk dates in the aggregation is preventing to get a single min() days-offset. I assume there is a logically conflict and/or you may need another data-structure. Maybe two tables - one with granular information and the other with the aggregated ones ...