Calculate time series

alessio_agrimi · ‎2022-04-13

Hi everyone,

I'm developing a Qlik Sense app in which I want to calculate a time series starting from this table

ZMDM_DBL1:
LOAD * Inline [
ext_ui         , adat      , DC_EFF_EA          , DC_EFF_EI
IT002E9999991A , 04/04/2022, 05/04/2022 11:00:00, 05/04/2022 03:00:00
IT002E9999992A , 04/04/2022, 05/04/2022 11:00:00, 05/04/2022 17:00:00
IT002E9999993A , 04/04/2022, 05/04/2022 13:00:00, 05/04/2022 03:00:00
IT002E9999994A , 04/04/2022, 05/04/2022 16:00:00, 05/04/2022 17:00:00
IT002E9999991A , 05/04/2022, 06/04/2022 13:00:00, 06/04/2022 05:00:00
IT002E9999992A , 05/04/2022, 06/04/2022 14:00:00, 06/04/2022 18:00:00
IT002E9999993A , 05/04/2022, 06/04/2022 13:00:00, 06/04/2022 05:00:00
IT002E9999994A , 05/04/2022, 06/04/2022 14:00:00, 06/04/2022 18:00:00
];

Dates are in the format DD/MM/YYYY and timestamps in the format DD/MM/YYYY hh:mm:ss.

Starting from this table I want to calculate an hourly time series that, after selecting a single 'adat', counts the number of 'ext_ui' that has both 'DC_EFF_EA' and 'DC_EFF_EI' less or equal of that hour in the time series.

So, selecting 'adat' = '04/04/2022' I want a time series starting from the next day at 05/04/2022 00:00:00 and finishing afeter 5 days at 10/04/2022 00:00:00 that counts

from 05/04/2022 00:00:00 to 05/04/2022 10:00:00 --> 0 complete - 4 incomplete

from 05/04/2022 11:00:00 to 05/04/2022 12:00:00 --> 1 (IT002E9999991A) complete - 3 incomplete

from 05/04/2022 13:00:00 to 05/04/2022 16:00:00 --> 2 (IT002E9999993A) complete - 2 incomplete

from 05/04/2022 17:00:00 to 10/04/2022 00:00:00 --> 4 (IT002E9999992A, IT002E9999994A) complete - 0 incomplete

I tried two ways (both working) but wants to know if there is a more efficient way.

Starting from the creation of a calendar in the script:

// 1) Master calendar:
Temp: 
Load 
min(adat) as minDate,
max(adat) as maxDate
Resident ZMDM_DBL1;

Let vIntervalloMax = 120;
Let varMinDate = Num(Peek('minDate', 0, 'Temp')); 
Let varMaxDate = Num(Peek('maxDate', 0, 'Temp')); 
DROP Table Temp; 

TempCalendar:
LOAD 
Date($(varMinDate) + Iterno()-1) as adat,
Timestamp($(varMinDate) + (RecNo()-1)/24 + Iterno()) As AddedTimestamp
AUTOGENERATE $(vIntervalloMax)
While $(varMinDate) + IterNo()-1 <= $(varMaxDate);

1- Dynamic - directly in FrontEnd graph

I created a Time series Graph using:

Dimension --> Group: AddedTimestamp

Dimension --> Line:

= IF(SubStringCount(
IF(not(isnull(DC_EFF_EA)) AND DC_EFF_EA <> '' AND DC_EFF_EA <= AddedTimestamp,'S_','N_') &
IF(not(isnull(DC_EFF_EI)) AND DC_EFF_EI <> '' AND DC_EFF_EI <= AddedTimestamp,'S_','N_')
, 'N') = 0,
'Complete','Incomplete')

Measure:

count(ext_ui)

This solution works but takes long time to produce the graph.

2 - Calculate the time series in the script

To optimize the Front End graph, I tried to calculate the time series in the loading script:

LEFT JOIN (TempCalendar)
LOAD 
adat,
ext_ui,
DC_EFF_EA,
DC_EFF_EI
Resident ZMDM_DBL1;

MasterCalendar: 
Load 
  adat,
  AddedTimestamp,
count(
IF(SubStringCount(
IF(not(isnull(DC_EFF_EA)) AND DC_EFF_EA <> '' AND DC_EFF_EA <= AddedTimestamp,'S_','N_') &
IF(not(isnull(DC_EFF_EI)) AND DC_EFF_EI <> '' AND DC_EFF_EI <= AddedTimestamp,'S_','N_')
, 'N') = 0,
ext_ui
)) as Complete,
count(
IF(SubStringCount(
IF(not(isnull(DC_EFF_EA)) AND DC_EFF_EA <> '' AND DC_EFF_EA <= AddedTimestamp,'S_','N_') &
IF(not(isnull(DC_EFF_EI)) AND DC_EFF_EI <> '' AND DC_EFF_EI <= AddedTimestamp,'S_','N_')
, 'N') <> 0,
ext_ui
)) as Incomplete
Resident TempCalendar 
group by adat, AddedTimestamp
Order By adat ASC, AddedTimestamp ASC; 

Drop Table TempCalendar;

This way takes a lot in the script but the front end is greatly enhanced. The problem that I see is the JOIN between TempCalendar (120 record for each 'adat') and ZMDM_DBL1 (about 800.000 records for each 'adat')

Is there another way to achive this result?

Thanks,

Alessio

Client Managed

General Question