Hey,
Was wondering if anyone has run into the following issue.
I am trying to find the closest date to another date when there are multiple entries per customer.
I calculate the days between the dates and filter for the minimum, but the issue is that I still get multiple minimum rows when grouping for the final table. FirstSortedValue() does not work because a single customer can have multiple events with the same days between (usually 0), and I cannot think of any key that would work.
The script looks something like this:
TMP:
NoConcatenate
LOAD *
WHERE DAYS_BETWEEN >= 0;
LOAD *,
    If(IsNull(days_between_TEMP), 0, days_between_TEMP) as DAYS_BETWEEN;
LOAD *,
    EVENT_DATE - RISK_DATE as days_between_TEMP
RESIDENT QUACK;

FINAL:
LOAD CUSTOMER_ID,
    SCORE,
    EVENT_DATE,
    Min(DAYS_BETWEEN) as DAYS_TO_RISK
RESIDENT TMP
GROUP BY CUSTOMER_ID, EVENT_DATE, RISK_DATE, SCORE;

DROP TABLES QUACK, TMP;
Hi @Zen_Not_On
I would like to help you, but I'm not sure I understand what you are trying to accomplish. Could you please provide a data sample with some examples and the outcome you expect for them?
That will make it easier to analyze.
Thank you
Daniel
Including the event/risk dates in the aggregation is what prevents you from getting a single min() days offset per customer. I assume there is a logical conflict and/or you may need a different data structure. Maybe two tables: one with the granular information and the other with the aggregated values ...
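For example, a rough sketch of that two-table idea, assuming the fields from your QUACK table (the DETAIL and SUMMARY names are just illustrative):

// granular table: one row per customer/risk/event combination
DETAIL:
NoConcatenate
LOAD *,
    EVENT_DATE - RISK_DATE as DAYS_BETWEEN
RESIDENT QUACK
WHERE EVENT_DATE - RISK_DATE >= 0;

// aggregated table: one row per customer with the smallest offset,
// associated with DETAIL through CUSTOMER_ID
SUMMARY:
LOAD CUSTOMER_ID,
    Min(DAYS_BETWEEN) as DAYS_TO_RISK
RESIDENT DETAIL
GROUP BY CUSTOMER_ID;

DROP TABLE QUACK;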
The current output looks something like this; what I want is to keep only the rows where DAYS_TO_RISK is the lowest possible:
| CUSTOMER_ID | SCORE | RISK_DATE | EVENT_DATE | DAYS_TO_RISK |
| --- | --- | --- | --- | --- |
| 1 | HIGH_RISK | 19.11.2025 | 11.12.2025 | 22 |
| 1 | HIGH_RISK | 19.11.2025 | 12.12.2025 | 23 |
| 1 | HIGH_RISK | 19.11.2025 | 15.12.2025 | 26 |
| 1 | HIGH_RISK | 11.12.2025 | 11.12.2025 | 0 |
| 1 | HIGH_RISK | 11.12.2025 | 12.12.2025 | 1 |
| 1 | HIGH_RISK | 11.12.2025 | 15.12.2025 | 4 |
| 1 | HIGH_RISK | 12.12.2025 | 12.12.2025 | 0 |
| 1 | HIGH_RISK | 12.12.2025 | 15.12.2025 | 3 |
| 1 | STANDARD_RISK | 15.12.2025 | 15.12.2025 | 0 |
As hinted, just remove the dates from the aggregation, maybe something in this way:
t: load *, Date1 - Date2 as Offset from X;
left join(t) load Customer, min(Offset) as Offset resident t group by Customer;
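The joined minimum can then be used to keep only the closest record while all other columns stay in place. A rough sketch, assuming X is your source table and joining the minimum back under its own name (MinOffset and the result label are just illustrative):

t:
LOAD *, Date1 - Date2 as Offset FROM X;

LEFT JOIN (t)
LOAD Customer, Min(Offset) as MinOffset
RESIDENT t
GROUP BY Customer;

// keep only the row(s) with the smallest offset per customer;
// every other column of t is still available
result:
NoConcatenate
LOAD * RESIDENT t
WHERE Offset = MinOffset;

DROP TABLE t;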
Another approach may be to use inter-record functions to order/classify the records, for example:
t: load *, Date1 - Date2 as Offset from X;
load *, if(Customer = previous(Customer), peek('Nr') + 1, 1) as Nr
resident t order by Customer, Offset desc;
Yes, but the final result must contain all columns. If I remove the dates, I have no way of connecting it all back together.
Then take a look at my second suggestion with the inter-record functions. All records remain and are simply ordered and flagged.
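Completing that idea with your field names, a sketch could look like this (assuming the TMP table from your script is still available, i.e. before the final aggregation and drop; the FLAGGED name is just illustrative):

// order the records so the smallest offset comes first per customer,
// then number them - Nr = 1 marks the closest event (ties are broken arbitrarily)
FLAGGED:
NoConcatenate
LOAD *,
    If(CUSTOMER_ID = Previous(CUSTOMER_ID), Peek('Nr') + 1, 1) as Nr
RESIDENT TMP
ORDER BY CUSTOMER_ID, DAYS_BETWEEN asc;

DROP TABLE TMP;

// all records and columns remain; filter on Nr = 1 in the front end,
// or reduce the table if only the closest rows are needed:
// FINAL: NoConcatenate LOAD * RESIDENT FLAGGED WHERE Nr = 1; DROP TABLE FLAGGED;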
Hi @Zen_Not_On
Do you mean something like the behavior of the code below? It removes all the rows except the ones with the lowest days to risk per Customer and Score (I added a second ID to your output data to check that it works for more than one).
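Something along these lines, as a sketch against the FINAL table from your script (the MIN_PER_CUSTOMER name is just illustrative):

// one row per customer and score with the smallest days to risk
MIN_PER_CUSTOMER:
LOAD CUSTOMER_ID,
    SCORE,
    Min(DAYS_TO_RISK) as DAYS_TO_RISK
RESIDENT FINAL
GROUP BY CUSTOMER_ID, SCORE;

// the inner join on CUSTOMER_ID, SCORE and DAYS_TO_RISK keeps only the
// rows of FINAL that match the minimum per customer and score
INNER JOIN (FINAL)
LOAD * RESIDENT MIN_PER_CUSTOMER;

DROP TABLE MIN_PER_CUSTOMER;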
Let me know if it works for you or if some adjustments are needed.
Kind Regards
Daniel