Skip to main content
Announcements
Have questions about Qlik Connect? Join us live on April 10th, at 11 AM ET: SIGN UP NOW
cancel
Showing results for 
Search instead for 
Did you mean: 
Not applicable

Gapless period of time

Hi folks,

I have a log which documents the transfer of data by our customers and I want to ensure that they've sent data for each day. Now they can send each day individually or a whole patch or even both. For example:

CustomerDatefromDateto
000101.08.201216.08.2012
000101.08.201202.08.2012
000117.08.201217.08.2012
000118.08.201218.08.2012
004201.08.201216.08.2012
004217.08.201218.08.2012
133701.08.201204.08.2012
133702.08.201203.08.2012
133704.08.201204.08.2012
133706.08.201206.08.2012
133710.08.201218.08.2012

In the period 01.08. - 18.08. the customers 0001 und 0042 transmitted data for each day, customer 1337 failed to do so for the 05.08.. As result I need a list which customer failed on which day to transmit data.

5 Replies
Not applicable
Author

Where are my manners... Could someone please help me? Thanks for each suggestion!

sbaldwin
Partner - Creator III
Partner - Creator III

look into the intervalmatch function to count the number of days actually received vs the days between the max and min dates, below is an example bit of code that will do this in the script and write out the errors to a table.

hope this helps

Thanks

Steve

SET ThousandSep=',';
SET DecimalSep='.';
SET MoneyThousandSep=',';
SET MoneyDecimalSep='.';
SET MoneyFormat='£#,##0.00;-£#,##0.00';
SET TimeFormat='hh:mm:ss';
SET DateFormat='DD/MM/YYYY';
SET TimestampFormat='DD/MM/YYYY hh:mm:ss[.fff]';
SET MonthNames='Jan;Feb;Mar;Apr;May;Jun;Jul;Aug;Sep;Oct;Nov;Dec';
SET DayNames='Mon;Tue;Wed;Thu;Fri;Sat;Sun';

data_store:
LOAD * INLINE [
Customer,Datefrom,Dateto
1,01/08/2012,16/08/2012
1,01/08/2012,02/08/2012
1,17/08/2012,17/08/2012
1,18/08/2012,18/08/2012
42,01/08/2012,16/08/2012
42,17/08/2012,18/08/2012
1337,01/08/2012,04/08/2012
1337,02/08/2012,03/08/2012
1337,04/08/2012,04/08/2012
1337,06/08/2012,06/08/2012
1337,10/08/2012,18/08/2012
];

data:
noconcatenate LOAD * Resident data_store;

temp:
load min(Datefrom) as mmin, max(Dateto) as mmax
Resident data;


let v_min = peek('mmin',0,'temp');
let v_max = peek('mmax',0,'temp');

drop table temp;

temp:
load
date($(v_min)+recno()-1) as tempdate
AutoGenerate ($(v_max)-$(v_min)+1);


join (data)
IntervalMatch(tempdate)
  load distinct Datefrom,Dateto Resident data;

drop table temp;

temp:
noconcatenate load Customer, 1+(max(Dateto)  - min(Datefrom)) as daterange, count(distinct tempdate) as datecount
Resident data
group by Customer;

audit:
noconcatenate load distinct Customer ,daterange,datecount
Resident temp where daterange <> datecount;
drop table temp;

drop table data;

Not applicable
Author

Hi Steve,

this code helps alot on small datasets. Unfortunally I have a huge - I mean HUGE - dataset. Roughly 5.5 million entries. I tested your code with 700k and it took 20GB RAM, the 5 million set crashed after several hours and 120GB RAM usage.

Thanks for the help but I'm going to need something else.

Greetings


Markus

Not applicable
Author

I tried to replace the join(DATA) part, but I'm not going anywhere right...

sbaldwin
Partner - Creator III
Partner - Creator III

ok heres a thought, what if you load the table into a temp table, then load it again to another but sorting by first the customer then the DateTo, then you can use the peek function to check the previous date that is smaller than the currently loaded DateFrom , like this:



temp:
LOAD
Customer,
num(Datefrom) as Datefrom,
num(Dateto) as Dateto
INLINE [
Customer,Datefrom,Dateto
1,01/08/2012,16/08/2012
1,01/08/2012,02/08/2012
1,17/08/2012,17/08/2012
1,18/08/2012,18/08/2012
42,01/08/2012,16/08/2012
42,17/08/2012,18/08/2012
1337,01/08/2012,04/08/2012
1337,02/08/2012,03/08/2012
1337,04/08/2012,04/08/2012
1337,06/08/2012,06/08/2012
1337,10/08/2012,18/08/2012
];


data:
noconcatenate load
Customer,
Datefrom,
Dateto,
if( Datefrom <= peek('Dateto')+1 ,0, if(Customer = peek('Customer') ,1,0)) as check
Resident temp
order by Customer,Dateto;

drop table temp;