Skip to main content
Woohoo! Qlik Community has won “Best in Class Community” in the 2024 Khoros Kudos awards!
Announcements
Nov. 20th, Qlik Insider - Lakehouses: Driving the Future of Data & AI - PICK A SESSION
cancel
Showing results for 
Search instead for 
Did you mean: 
alexis
Partner - Specialist
Partner - Specialist

Intervalmatch() or Date comparison?

I have simplified my problem with this following example:

_test1.jpg

I have a Calendar (as above) and a file (called Usage here) that has records that contain a StartDate an EndDate and a SubscriberKey. The 2 tables need to be disjointed (remember this is a small part of a very large application).

I need to see a distribution (by Month-Year) of active subscribers ( i.e. count(distint (SubscriberKey)))  - the way that I ascertain that a subscriber is active for a specific month is to see whether they have a StartDate that is either inside or before the month and an EndDate that is no less than the beginning of the month.

As an example, active subscribers for the month of "May 2014" are those that have a StartDate that is less than or equal 31/5/2014 and an EndDate that is greater or equal 1/5/2014. I have already ensured that the EndDate is greater than the StartDate.

_test2.jpg

I  attach the 2 QVDs that make up the data set for the above.

This is not a classic intervalmatch scenario (I don't think) as I am not charting by the Start/EndDate intervals but merely using these fields to establish inclusion.

Any assistance would be much appreciated.

Alexis

1 Solution

Accepted Solutions
rwunderlich
Partner Ambassador/MVP
Partner Ambassador/MVP

In your data I would:

Matches:

IntervalMatch (Date)

LOAD StartDate, EndDate RESIDENT TestUsage;

Then you can create a chart with dimension=MonthYear and Expression= count(DISTINCT SubscriberKey).


With your sample data, you get a 48M row Matches table. This gives you a lot of flexibility, but depending on the size of your final data you may want to roll up that data into some monthly figures and drop the detail.


Example attached.

-Rob

View solution in original post

9 Replies
Not applicable

Hi Alexis,

Can you give a try for the below:

In your calendar load part, add those 2 fields:

Floor(MonthStart(DateID)) as StartOfMonth,

Floor(MonthEnd(DateID)) as EndOfMonth

Then the expression would be:

count(distinct if(StartDateInt <= EndOfMonth and EndDateInt >= StartOfMonth, SubscriberKey))

Hope this gives you the desired results.

alexis
Partner - Specialist
Partner - Specialist
Author

Thank you for that - that was very helpful.

There is in fact another criterion (the Status flag) which I have added and it seems to give the right results.

The formula now looks as follows:

= count(distinct if(StartDateInt <= EndOfMonth and EndDateInt >= StartOfMonth and Status = 3, SubscriberKey))

Comments on this approach:

a) Seems to be quite slow in recalculating when selections are made

b) Is there a SET ANALYSIS version of this that you can think of that might work. This would be useful for using the "1" and "$" notation to ignore selections etc..

I tried without success:the below:

= count(distinct {$<[StartDateInt] = {"<=[EndOfMonth]"}, [EndDateInt] = {">=[StartOfMonth]"}, [Status] = {3} >}

SubscriberKey)

Anybody spot anything wrong!

Alexis

rwunderlich
Partner Ambassador/MVP
Partner Ambassador/MVP

"This is not a classic intervalmatch scenario (I don't think) as I am not charting by the Start/EndDate intervals but merely using these fields to establish inclusion."


In my mind, it is a classic IntervalMatch application. IntervalMatch will connect each Subscriber to Date rows in the Calendar table. Those are all the Dates the Subscriber was active. Then you can count(DISTINCT SubscriberKey) across any of the dimensions in the Calendar table.


For a similar example, see

Qlikview Cookbook: Expand a Pricing Date Table http://qlikviewcookbook.com/recipes/download-info/expand-a-pricing-date-table/


-Rob

alexis
Partner - Specialist
Partner - Specialist
Author

Hi Rob

Thank you for taking the time to reply.I am a big fan of many of your postings and articles.

I am not sure how your example relates to the example that I posted - can you be so kind to look at my example and show me how to resolve my issue.

Instead of loading my calendar.qvd you can generate the calendar using the code as suggested ealrier on this thread:

Calendar:

Load

  TempDate As DateID,

  Date(TempDate) As Date,

  Year(TempDate) As Year,

  Month(TempDate) As Month,

  Day(TempDate) As Day,

  WeekDay(TempDate) As WeekDay,

  Week(TempDate) As Week,

  'Q' & Ceil(Month(TempDate)/3) As Quarter,

  Date(MonthStart(TempDate), 'MMM-YYYY') As MonthYear,

  Floor(MonthStart(TempDate)) As StartOfMonth,

  Floor(MonthEnd(TempDate)) As EndOfMonth

  ;

Load

  RecNo() -1 + $(vStartDate) As TempDate

AutoGenerate ($(vNumDays));

My chart is using the expression:

= count(distinct if(StartDateInt <= EndOfMonth and EndDateInt >= StartOfMonth and Status = 3, SubscriberKey))

Thanks

Alexis

alexis
Partner - Specialist
Partner - Specialist
Author

Hi again,

Based on your suggestion it made me realise that the expression can be further simplified in order to enable you to report by ANY of the calendar dimensions not just Month-Year:

= count(distinct if(StartDateInt <= DateID and EndDateInt >= DateID and Status = 3, SubscriberKey))

The only problem is that it is very slow to calculate every time a selection is made or the level of granularity is changed (say from Month to week) - I have a feeling SET ANALYSIS will help. Can someone help redefine the above removing the "If" statement?

Thanks again

Alexis

rwunderlich
Partner Ambassador/MVP
Partner Ambassador/MVP

In your data I would:

Matches:

IntervalMatch (Date)

LOAD StartDate, EndDate RESIDENT TestUsage;

Then you can create a chart with dimension=MonthYear and Expression= count(DISTINCT SubscriberKey).


With your sample data, you get a 48M row Matches table. This gives you a lot of flexibility, but depending on the size of your final data you may want to roll up that data into some monthly figures and drop the detail.


Example attached.

-Rob

alexis
Partner - Specialist
Partner - Specialist
Author

Thanks Rob

This works perfectly although the presence of the synthetic keys is a little messy and off-putting.

Is there any way of removing the synthetic keys?

Thanks again for helping out

Alexis

Gysbert_Wassenaar

Henric Cronström says this in his blog post IntervalMatch about the synthetic key that is created by the intervalmatch:

Further, the data model contains a composite key (the FromDate and ToDate fields) which will manifest itself as a QlikView synthetic key. But have no fear. This synthetic key should be there; not only is it correct, but it is also optimal given the data model. You do not need to remove it.


talk is cheap, supply exceeds demand
alexis
Partner - Specialist
Partner - Specialist
Author

Thanks for the heads-up Gysbert - in an already "busy" data structure the fewer table the better so I was hoping to be able to remove the synthetic tables...

BR

Alexis