Qlik Community

Qlik Design Blog

All about product and Qlik solutions: scripting, data modeling, visual design, extensions, best practices, etc.

Creating Reference Dates for Intervals

Last week I wrote about how to create reference dates for exchange rates and warehouse balances: How to populate a sparsely populated field.

 

There is however also a second case where you want to create reference dates, but the data model is quite different. It is when you have a list of contracts with validity periods: Rental contracts, Insurances policies, Healthcare commitments, etc.

 

Each contract has a begin day and an end day. The analysts of an insurance company would probably want to ask the question: “How many valid insurance policies did we have on this specific day?” In other words, he wants to click on a reference date to see the count of policies that are associated with this date – even though this date doesn't exist in the source data.

 

The solution is to first load all policies in one table. Then load a second table that has one record per policy and date (Policies_x_Dates). This second table can in turn link to a master calendar.

 

Data Model.png

 

But the middle table does not exist in the source database. Instead, you must generate it using a While loop that loops over each record in the Policies table, i.e. each source record will be loaded several times. This way, you can load not only the “From” date and the “To” date, but also all dates in between:

 

Policies:
Load PolicyID, BirthDate, PolicyAmount, FromDate, ToDate, OtherPolicyAttribute
          From Policies;

 

Policies_x_Dates:

Load PolicyID,
          Age( FromDate + IterNo() – 1, BirthDate ) as Age,
          PolicyAmount / (ToDate - FromDate + 1) as DailyAmount,
          Date( FromDate + IterNo() – 1 ) as ReferenceDate
          Resident Policies
          While IterNo() <= ToDate - FromDate + 1 ;

 

Note that the Policies table has exactly one record per insurance policy, and the newly created Policies_x_Dates table has exactly one record per combination of policy and date. Note also that there are other fields that should be put in the Policies_x_Dates table, e.g., the age of the insured person, since this depends on the reference date. Further, it is possible to break up a cost or an income into daily amounts, which is useful when you want to show the correct amount distributed over the year.

 

Amount over time.png

 

The While loop is a very useful tool whenever you need to create additional tables in the data model. It is often a better option than the IntervalMatch.

 

Read more in Generating Missing Data In QlikView.

 

HIC

29 Comments
Partner
Partner

Excellent post Henric!

This gives me a whole new way of doing things I previously took a long time to do in scripts and create redundant statements. I guess we could extend this into Cumulative Running Total for inventory as well.

0 Likes
1,996 Views
maxim_senin
Contributor III

Kudos to Henric!

0 Likes
1,996 Views
Not applicable

Why might this be a better option than IntervalMatch?

1,996 Views

Don't get me wrong - intervalmatch is an excellent solution, If you have applications that work and that use intervalmatch, you should keep them as they are.

It is just that a solution using a While loop is sometimes more straightforward and easier to understand than a solution based on intervalmatch. (You need to make a script that someone else can read and manage...)

For example: If you solved the example above using intervalmatch, you could not as easily create the fields Age and DailyAmount in the IntervalMatch table.

Further, in situations where you have slowly changing dimensions, e.g. in a case where the salesperson changes department and you need to create a composite key based on both date and salesman, the intervalmatch solution is slightly more complex than a solution using a While loop.

HIC

1,996 Views
Not applicable

Hi Henric,

this post is really helpful considering what I've been trying to solve at the moment:

I've some entities with a starting date and a closing one, and a state, and I've to count a penalty for each day after the second between the starting and the closing date. (i.e. if it started on April 1st and closed on April 10th, I've to count a penalty for each day between April 4th to April 10th). Then I'll have to graph these penalties on a daily bases to see how much the total grows up day by day.

0 Likes
1,996 Views

Sounds like a while loop will solve the problem easily. Good luck!

I am glad that I could help you.

HIC

0 Likes
1,996 Views
Not applicable

Hi Henric

Great worked well for me this option as opposed to interval match but I do have a SYN table..should that be expected?

Thanks

David

0 Likes
1,996 Views

No, normally you should not have a synthetic key. Could it be that you have included more than one one field from the contracts table (or equivalent) when you make the While loop?

HIC

0 Likes
1,996 Views
Not applicable

Your right Sorry..should have worked that out myself!! now sorted thanks

0 Likes
1,996 Views
Not applicable

Thank you Henric!  Using this in a solution for a client that is not using a master calendar linked to the data model.

0 Likes
1,996 Views
rajeshvaswani77
Valued Contributor III

Thanks Henric, this is an interesting blog.

0 Likes
1,996 Views
carbal1952
Contributor II

Thanks Henric:

I follow every one of your post & documents.

I made some kind of Knowledge Base with them

by the way ...

hic

Why did you change your last name ?

CB

0 Likes
1,996 Views

No, the name hasn't changed. It's still Cronström. Just some software problems since we upgraded the software on which QlikCommunity is running...

HIC

0 Likes
1,996 Views
peschu123
Contributor III

Thanks Henric for your great continous work .

UTF-8 for the win!

0 Likes
1,996 Views
wizardo
Contributor III

Hi,

I loved that post, in qlikview many times similar things can be done id different and varied ways.

Its always nice to meet an approach that shades the light from a different   point/angle/height/perspective.

It gives one hope that when he stumbles upon difficulties in his own way, someone can come along and shade the light on something not seen before:)

hope which i admit i posses.

In my case i have a data model that contains 5 main Fact tables that in turn contain many of those pairs of dates.

E.G. creation and deletion date, offered and converted date, etc etc.

Lets say about 5-6 pairs per main table. and each pair may be related to a different attribute.

In light of your post what would be your recommendation in dealing with this situation?

Creating a big combination date-attribute  table in between the main calendar and my link-table?

Using your method? or intervalMatch.

Thanks again anyways for a great post (one of many, many of yours)

Daniel

0 Likes
1,996 Views
gerhard_jakubec
New Contributor III

Hi Henric,

thanks for this very straight forward post. If i got it right i'd say in your example the Policies_x_Dates table would be your fact table (although it's derived from Policies dimension) whereas Policies and Master_Calendar make up your dimensions. However this approach increases data amount by ("# of policies" * "days ot policy duration") so i try to work out an alternative approach to exactly this requirement based on only two fact rows per policy (to reduce data volumy by "days of duration" - 2). Your main fact table would then hold only two records for each single policy.

ReferenceDate, PolicyID, xType

01.01.2000, 1, 'From Date'

31.12.2009, 1, 'To Date'

However i am afraid this concept leads to much more complex scripts and bad performance since almost all KPIs (e.g. Age) can not be stored inside the fact table any more and will have to be calculated on the fly. What do you think about it?

p.s.: also you have to use count( distinct PolicyID) to get the total number of policies correct?

0 Likes
1,996 Views

You are right that it increases the data amount, but I don't see any way around it. The alternative is - just as you have noticed - that you need to calculate complex things on the fly and create a "kludgy" solution.

No, the best solution is to use a Policies_x_Dates table. And using a Count(distinct PolicyID) is not a problem. See http://community.qlik.com/blogs/qlikviewdesignblog/2013/10/22/a-myth-about-countdistinct

HIC

0 Likes
1,996 Views
upaliwije
Contributor II

Hi Henric

Please elaborate me how a P&L statement could be prepared using your above method insted of interval match

0 Likes
1,996 Views
Not applicable

HIC, such a great tipp

Thx for sharing.

Grüße aus Düsseldorf 🙂

/olli

0 Likes
1,996 Views
Not applicable

Hi Henric,

This post is very very excellent !!

My question is : What should we do if we have more than two date ?

For example. Suppose a business which treat claims.

So, In their Excel File we have

    Claim ID

    Arriving Date of Claim in the business

    Arriving Date of Claim in the Claim Service

    Treatment Date of the Claim

    Customer ID who sent Claim

    Sending Date of the Claim to the customer

    Empoyee who treated the claim

The analyst want to be able to select a date and see many KPI calculated from differents date.

How to do it Please ?

0 Likes
1,996 Views

You should create one master calendar per date:

Why You sometimes should Load a Master Table several times.

And, in addition, you should create a Canonical date that can represent any date:

Canonical Date.

You probably also want to measure the time it took to treat a Claim, by calculating

  [Arriving Date] - [Treatment Date] as Delay. See more on Buckets.

HIC

0 Likes
1,996 Views
Not applicable

Hi Henric,

This indeed is an incredible use of reference dates for interval.  I must also add that I have applied your  suggestions on the other application – and it works superbly. Also I would like to share a free resource on qlikview which is very useful : https://www.youtube.com/watch?v=8olMt2AOUJ8

0 Likes
1,996 Views
Not applicable

@Henric..and Others...I am new to Qlik. Can you help me here please? I have Transaction data and Discount tiers. I want to match both and would like to have desired results as described below..

I placed complete file in my brand new question...

Need help in computing Discounts based on Min and Max limits (by Customer)

Thanks for your help.

Transaction Data sample:

C_YearC_QtrDateProductCustomer NumberQuantityMember Type
2014412/1/2014Apple1000140AAA
2014412/1/2014Orange1000155AAA
2014412/1/2014Banana1000175AAA
2014412/1/2014Apple1000260BBB
2014412/1/2014Orange1000263BBB
2014412/1/2014Banana1000244BBB
2014412/1/2014Apple1000349CCC
2014412/1/2014Orange1000354CCC
2014412/1/2014Banana1000359CCC
2014412/2/2014Apple1000140AAA

Discount Tiers Data (Sample):

C_YearC_QtrProductDiscount_TypeMember_TypeTierStartEndDiscount%
20144AppleGAAAG_Tier_00340
20144AppleGAAAG_Tier_13549.91
20144AppleGAAAG_Tier_25064.992
20144AppleGAAAG_Tier_3659999993
20144AppleGBBBG_Tier_13549.91
20144AppleGBBBG_Tier_25064.992
20144AppleGBBBG_Tier_3659999993
20144AppleGCCCG_Tier_110252
20144AppleGCCCG_Tier_226503

Desired Output:

C_YearC_QtrProductCustomer NumberMember TypeQuantity (Sum of Quarter)Discount_TypeTierDiscount%
20144Apple10001AAA80GG_Tier_33
20144Orange10001AAA150GG_Tier_33
20144Banana10001AAA110GG_Tier_33
20144Apple10002BBB120GG_Tier_33
20144Orange10002BBB88GG_Tier_33
20144Banana10002BBB126GG_Tier_33
20144Apple10003CCC98GG_Tier_33
20144Orange10003CCC118GG_Tier_34
20144Banana10003CCC108GG_Tier_34

0 Likes
1,996 Views
sanjyotpatkar
Contributor III

Hello Henric

  am using a combination of Canonical Calendar and the above calendar concept to calculate missing dates.need to show a trend of outstanding tickets across one common month axis.

Outstanding ticket count as of 28 Feb = all tickets with blank [Resolution Date],[Resolved Date] and [Resolve date UTC] while being opened in Feb or prior to Feb. +

Tickets that were opened in Feb or prior to Feb but resolved after 28 Feb.

Challenge that I am facing is that even if I use set analysis and flags while counting the outstanding tickets I am getting performance issues while clicking the listbox. Numbers get populated correctly but it is slowing down the application. Can you suggest what else should i consider while changing my script?

0 Likes
1,996 Views
Not applicable

Hi Henric,

I have 450,000 rows of records with dozens of fields, the performance is already slow with just my raw dataset. Is there a less expensive method that you would recommend? Creating a unique Policy ID x reference date table for each record seems like it would crash the app.

0 Likes
1,996 Views
Partner
Partner

If you have performance problems with only 450 000 rows and some dozen fields
then you have some other serious issue with your application and data modell.
On reasonably sized hardware that amount of rows should not be a problem.

1,996 Views
datanibbler
Esteemed Contributor

Hi,

that is indeed an excellent approach.

I think IntervalMatch() is pretty straightforward - you have to do a bit of thinking because the integrated help_file does not stick with one example for the simple and the advanced syntax, but it should not be a problem to think your way through it.

It does take longer than just keeping the table with just a start- and an enddate and using IntervalMatch - I just have a scenario and I also went for a loop generating one record for every day, but it takes quite a while - but since IntervalMatch can only deal with the intervals on one side - in one of the two tables there have to be discrete datapoints to fit in the intervals - I needed to do this for one of the two tables involved.

>>> What I don't quite understand - what is the advantage of having this three-tables-scenario instead of just "pumping up" the one table - that would be the "policies" table to stick with your example? (there would have to be one or two counter_tables,  but those can be deleted once the loop has completed, so that doesn't matter)

Thanks!

Best regards,

DataNibbler

0 Likes
1,996 Views
sanjyotpatkar
Contributor III

Hi..

I have a similar request to show all the dates and multiple KPIs with it.

I have used a combination of canonical calendar and reference dates approach and I have noticed that as the months start increasing the time to execute the file takes long and it slows down the performance. Do you have any other solution for it?

0 Likes
1,996 Views
PreranaM
New Contributor

 Hi Henric,

In the fact table I have 4 Date : CMFrom,CMTo,LCMFrom,LCMTo.

and In Date Range have 2 Date : FromDate, ToDate

i have multiple condition :

ToDate    >=LCMFrom and

                   <=CMTo  and

                    >=CMFrom  

FromDate    <= CMFrom  and

                        <=  LCMTo

can you please suggest something about this.

i am thinking to create 2 more date field in fact table just to create interval(TempMaxDate & TempMinDate) :

Inteval1 : ToDate    :

                    >=LCMFrom and

                     <=CMTo 

Interval2 :ToDate:

                               >=CMFrom   and 

                                  <=TempMaxDate (can set it to 2020 something)

Interval3 :FromDate:

                                       <= CMFrom  and

                                          >=TempMinDate (can set it to 1900 something)

Interval4 :FromDate:

                                       <= LCMTo and

                                          >=TempMinDate (can set it to 1900 something)

 

uptill now i have implemented 2 intervals  and its taking so much time to reload , i dnt know what will happen if i add 4 interval.

 

 

 

 

0 Likes
1,933 Views