Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hello everyone,
I'm struggling with a script where the keys are not working as expected.
I have a fact table with only three fields (although I have more in total, but created this in order to simplify), and later I create a final table with all the fields I need, properly transformed.
After that, I need to create a table with some filters and specific requirements.
rds_telemonitoring_alarm:
LOAD id as alarm_id,
account_number,
IF(date(FLOOR(datetime_edp_preprocessing)) >= Today() - 30, 'Lower 30 days', 'Greater 30 days') as has_recent_alarm_30_days //NEW2
FROM [lib://QVD.01.Stage/FSM/tlm_historic_alarm.qvd]
(qvd);
//---------------------------------------------------------
rds_telemonitoring_alarm_final:
NOCONCATENATE
LOAD
alarm_id,
account_number,
has_recent_alarm_30_days
RESIDENT rds_telemonitoring_alarm;
DROP TABLE rds_telemonitoring_alarm;
//---------------------------------------------------------
accounts_active_with_alarms_last_30_days:
LOAD
account_number, // Identificador único de la cuenta
// Count(alarm_id) as num_alarms,
IF( (Max(Date) >= Today() - 30), 'Lower 30 days', 'Greater 30 days') as has_recent_alarms
RESIDENT rds_telemonitoring_alarm_final // Fuente de datos: tabla cargada con las alarmas
WHERE
// Alarmas ocurridas en los últimos 30 días
telemonitoring_contract_active_flag__c = 1 // Cuenta con contrato TLS activo
AND (tls_contract_end_date__c > Today() OR IsNull(tls_contract_end_date__c)) // Contrato sin fecha de fin o aún vigente
GROUP BY
account_number;
However, when I select "Has Recent Alarms" > "Lower 30 Days", it doesn't return alarms from the last 30 days. Instead, it shows alarms with a datetime older than 30 days.
With the ERD,
Hope anyone can help me out
Thank you
Hi @migueldfr_23, sorry for the issue, i really didn't notice that some values have "-"...
So I fix the issue like this:
Try this, I think this helps you.
Make sure that your timestamp is really a timestamp and not a string or containing any other invalid data.
Personally I wouldn't tend to apply such expensive approach with an intermediate resident table and aggregating results in an extra dimension table else creating an appropriate cluster-information directly within the facts, for example with something like:
floor((MyDate - today()) / 30) as Alarm30Cluster
Hi, @migueldfr_23 Could you send a small sample of your dataset for testing (A small sample of rows is enough.)?
Also, could you send the full creation script for these three tables?
Hello @marcus_sommer,
Thank you for prompt response
the table has been implemented with this field which is Date.
rds_telemonitoring_alarm:
LOAD id as alarm_id,
account_number,
date(FLOOR(datetime_edp_preprocessing)) AS Date,
IF(date(FLOOR(datetime_edp_preprocessing)) >= Today() - 30, 'Lower 30 days', 'Greater 30 days') as has_recent_alarm_30_days //NEW2
FROM [lib://QVD.01.Stage/FSM/tlm_historic_alarm.qvd]
(qvd);
Thank you
Hello @savioricardog
Thank you for such a prompt response.
Here you have a sample of rows .xlxs, and also I have sent out all the script created.
Let me know if you need further samples.
Thank you
Here is the example with 10 rows.
Thanks
I tried the code above and it works... (I guess you’re having issues because the join between the table rds_telemonitoring_alarm and accounts_active_with_alarms_last_30_days is only based on account_number, and it can have duplicate values)
Try this:
Hello
Could you explain me why you are doing this middle table
LEFT JOIN (TEST)
LOAD
SubField(DateAccountKey, '-', 1) &'-'& MAX(SubField(DateAccountKey, '-', 2)) AS DateAccountKey,
SubField(DateAccountKey, '-', 1) AS account_number_max
RESIDENT TEST
GROUP BY SubField(DateAccountKey, '-', 1);
Thanks
Hi @migueldfr_23,
I used this middle table to create the field "account_number_max" and at the same time, build a consistent join key that I could use between accounts_active_with_alarms_last_30_days and TEST.
The issue I identified in your case is that if you join only on "account_number", there can be multiple matching rows for the same account (because it has multiple dates). When that happens, a join multiplies the results and can filter or distort the data unexpectedly.
By first calculating the latest date per account and creating a composite key (account_number + latest date), I ensure the join happens only on that specific latest record for each account. This avoids the one-to-many join problem and keeps the results consistent.
Thanks for such a good response, but I am still having issue with the order of the script.
From my knowledge, I will set what I had before ...