Qlik Community

Qlik Design Blog

All about product and Qlik solutions: scripting, data modeling, visual design, extensions, best practices, etc.

Canonical Date

 

A common situation when loading data into a Qlik document is that the data model contains several dates. For instance, in order data you often have one order date, one required date and one shipped date.

 

Base model.png

 

This means that one single order can have multiple dates; in my example one OrderDate, one RequiredDate and several ShippedDates - if the order is split into several shipments:

 

Logic 1.png

 

So, how would you link a master calendar to this?

 

Well, the question is incorrectly posed. You should not use one single master calendar for this. You should use several. You should create three master calendars.

 

The reason is that the different dates are indeed different attributes, and you don’t want to treat them as the same date. By creating several master calendars, you will enable your users to make advanced selections like “orders placed in April but delivered in June”. See more on Why You sometimes should Load a Master Table several times.

 

Your data model will then look like this:

 

Model with spec calendars.png

 

But several different master calendars will not solve all problems. You can for instance not plot ordered amount and shipped amount in the same graph using a common time axis. For this you need a date that can represent all three dates – you need a Canonical Date. This is how you create it:

 

First you must find a table with a grain fine enough; a table where each record only has one value of each date type associated. In my example this would be the OrderLines table, since a specific order line uniquely defines all three dates. Compare this with the Orders table, where a specific order uniquely defines OrderDate and RequiredDate, but still can have several values in ShippedDate. The Orders table does not have a grain fine enough.

 

This table should link to a new table – a Date bridge – that lists all possible dates for each key value, i.e. a specific OrderLineID has three different canonical dates associated with it. Finally, you create a master calendar for the canonical date field.

 

Full model.png

 

You may need to use ApplyMap() to create this table, e.g. using the following script:

 

     DateBridge:
     Load
          OrderLineID,
          Applymap('OrderID2OrderDate',OrderID,Null()) as CanonicalDate,
          'Order' as DateType
          Resident OrderLines;

     Load
          OrderLineID,
          Applymap('OrderID2RequiredDate',OrderID,Null()) as CanonicalDate,
          'Required' as DateType
          Resident OrderLines;

     Load
          OrderLineID,
          ShippedDate as CanonicalDate,
          'Shipped' as DateType
          Resident OrderLines;

 

If you now want to make a chart comparing ordered and shipped amounts, all you need to do is to create it using a canonical calendar field as dimension, and two expressions that contain Set Analysis expressions:

 

     Sum( {$<DateType={'Order'}>} Amount )
     Sum( {$<DateType={'Shipped'}>} Amount )

 

Bar chart.png

 

The canonical calendar fields are excellent to use as dimensions in charts, but are somewhat confusing when used for selections. For this, the fields from the standard calendars are often better.

 

Summary:

  • Create a master calendar for each date. Use these for list boxes and selections.
  • Create a canonical date with a canonical calendar. Use these fields as dimension in charts.
  • Use the DateType field in a Set Expression in the charts.

 

A good alternative description of the same problem can be found here. Thank you, Rob, for inspiration and good discussions.

 

HIC

153 Comments
cbushey1
Contributor III

hic‌,

I am hoping you might be able to quickly assist. I like your method here for creating one calendar when it is necessary and can certainly see the benefits of having the individual ones. With that said, I think something is off with my Canonical Calendar because when I select CanonicalYear = 2014, I show my three date fields  (Expired, FocusDate, CodeDate) as (03/23/2014,02/09/2015,03/23/2014). I wouldn't have expected the FocusDate to show since its year is 2015.

Any thoughts on why this might be?

Chase

0 Likes
356 Views

This is most likely completely correct.

By selecting CanonicalYear=2014, you pick out all the OrderIDs (or whatever you use as key) where one of the three days belongs to 2014 - but not necessary all three. So, if you have an OrderID with ExiredDate in 2014, it may well have a FocusDate that belongs to 2015.

HIC

0 Likes
356 Views

Anyone know why it is called a "Canonical Date" ?

0 Likes
356 Views
Luminary
Luminary

"Canonical" is a term used in computer science, physics, mathematics and other fields to explain that something is on a standardised form, where each object has a unique representation.

In the case of a calendar, a canonical calendar would be one that has a entry date for each date, like HIC described.

Think of a canonical calendar as "one calendar to rule them all", with only one record for each unique date.

A "normal" form is similar, with the difference that objects (e.g. dates) do not need to have unique representations.

It's been oh-so-many years since I learnt this stuff at university... so I am sure someone will correct me and/or provide additional details.

356 Views

Because I chose to call it this when I wrote the blog.

The concept of an official, proper, unified, generalized date in the data model existed before the blog post - but it didn't have a name. In one discussion I had, an English-speaking user described it as a "canonical form" of the date. This term stuck in my head, and I used it when writing the blog.

HIC

0 Likes
356 Views
Luminary
Luminary

...and of course Wikipedia has a pretty extensive article on the topic...

Canonical form - Wikipedia, the free encyclopedia

0 Likes
356 Views

That explains why I could find no suitable explanation of "Canonical Date" on the web.

Thanks for the clarification, I was starting to worry the web had broken.

0 Likes
356 Views
Partner
Partner

Great Post.

0 Likes
356 Views
jkampmeijer
New Contributor II

Thanks for the post.

I will use the Canonical Date solution in my dashboard.

0 Likes
356 Views
nizamsha
Valued Contributor II

Hi Henric,

I am little bit confused about the mapping in the bridge table here is your code.

BridgeTable:

    Load OrderLineID, Applymap('OrderID2OrderDate',OrderID,Null()) as CanonicalDate, 'Order' as DateType

          Resident OrderLines;

     Load OrderLineID, Applymap('OrderID2RequiredDate',OrderID,Null()) as CanonicalDate, 'Required' as DateType

          Resident OrderLines;

     Load OrderLineID, ShippedDate as CanonicalDate, 'Shipped' as DateType

          Resident OrderLines;

In Bridge table you have 3 columns and they are DateType,OrderLineID,CanonicalDate

In Order Table we dont have OrderLineID then how did you created the 'Order' as DataType and 'Required' as DataType my Question how did you take these column if i am not wrong  Your first   mapping is from OrderLineTable

and the column are orderID and orderLineITem based on this you might have used the applymap in ordertable and then you might have created 2 mapping table one is OrderID2OrderDate and the second one is OrderID2RequiredDate if it so you might have created a bridge table using these mapping tables right orders and required is from OrderTable but how did you done from OrderLine  i am little bit confused how you have taken the mapping kindly clarify where was the gap

0 Likes
356 Views

The first step is to create the two mapping tables: OrderID2OrderDate and OrderID2RequiredDate. These are created from the Orders table, i.e. one record per order.

Then I create the bridge table using the OrderLines table, i.e. one record per OrderLineID. This table also contains the OrderID, which is used to get the two dates from the mapping tables. This way I can link the OrderLinesID to the OrderDate and the RequiredDate.

HIC

0 Likes
356 Views
adamdavi3s
Honored Contributor

Hi Henric,


What sort of performance implications does this have?

I am just playing with a dashboard which has a 9m row fact and 5 dates, so the bridge is 45m rows.

The dashboard behaves as expected until I start drilling down to a small subset, say 50- 100 rows and then it really starts to grind.

This is the first time I've tried this method rather than disparate facts and a link table

0 Likes
356 Views

The bridge table can indeed become very large and this can affect performance.

However, in your case it sounds like something else. First, 45m records is not that much for modern computer with plenty of RAM. Secondly, you say that it is slower when you drill down to a small subset. It's usually the other way around: When you have reduced it to a small subset, it's faster.

So, I really can't say what it could be, without looking at the app.

HIC

0 Likes
356 Views
adamdavi3s
Honored Contributor

Hi Henric,


Thank you for coming back to me.

You're quite right and it was something else, a calculated dimension.

I thought it weird that the system slowed up as selections were made as this wasn't behavior I had experienced before.

When asking on the community what the issue could be, someone had suggested that it could be the size of the bridge table and the way the calculation engine was working, hence the question, but I am pleased to say that the app is now running very smoothly and I am really liking this method of working with dates.

356 Views
Not applicable

Hi HIC,

I have 22 different dates within my data model spread across Targets, Working Leads, Accounts, Marketing Funds, and Sales. All dates can relate back to an Account. I am wondering what my table would be the would correspond to the OrderLines Table? This way I can create my DateBridge Table? I have created master calendars for all 22 dates (seems excessive, but you never know what people will want to filter on). Also since they all can relate back to an account they may want to be looked at over the same time period (i.e. Accounts Lost verse Accounts Sold or Account Sales verse Account Marketing Funds - over time)..I am also working within Qlik Sense if this makes a difference.

Thanks

Madison

0 Likes
356 Views

Can an account have several possible values of a specific date, e.g. the sales date? If so, Account is not the right field to use.

You need to find the table that is the "lowest" one in the hierarchy - a table where a record never links to more than one "parent" (or record in another table). This table is a good candidate to use for the date bridge.

Whether it is QlikView or Qlik Sense doesn't matter. Same logic.

HIC

0 Likes
356 Views
Not applicable

hic‌, how do I join dates across 2 tables for a CanonicalDate. For example in my example, I have 3 date fields, 2 in one table and 3rd in 2nd table. How do I join all three to create a canonical date? I am able to do the dates in the same table, but fail with combining the third one. When I am trying to add the 3rd one , it will give a circular reference error because it is based on a different field.


This is my script:



Associate2AssociateBirthDate:

  Mapping Load Associate,AssociateBirthDateTime Resident [Associate];

Associate2AssociateReadTime:

  Mapping Load Associate,AssociateReadDateTime Resident [Associate];

Analyst2AnalystBirthDate:

  Mapping Load Analyst,AnalystBirthDateTime Resident [Analyst];

  

DateBridge:

Load Associate,ApplyMap('Associate2AssociateBirthDate',Associate,null()) as CanonicalDate

Resident [Associate];

Load Associate,ApplyMap('Associate2AssociateReadTime',Associate,null()) as CanonicalDate

Resident [Associate];

// Load Analyst,ApplyMap('Analyst2AnalystBirthDate',Analyst,null()) as CanonicalDate

// Resident [Associate];

The two tables are (1) Associate with fields Associate, AssociateBirthDateTime, AssociateReadDateTime

and (2) Analyst with AnalystBirthDateTime

The AnalystBirthDateTime is the one I am unable to put in the same CanonicalDate with the other two date fields.

356 Views

How are the fields Associate and Analyst linked?

If they are linked through a fact table, e.g.

Image1.png

then you should perhaps use the TransactionID as key for the date bridge:

DateBridge:

Load TransactionID,

  ApplyMap('Associate2AssociateBirthDate',Associate,null()) as CanonicalDate,

  'AssociateBirthDate' as DateType

  Resident [Facts];

Load TransactionID,

  ApplyMap('Associate2AssociateReadTime',Associate,null()) as CanonicalDate,

  'AssociateReadTime' as DateType

  Resident [Facts];

Load TransactionID,

  ApplyMap('Analyst2AnalystBirthDate',Analyst,null()) as CanonicalDate,

  'AnalystBirthDate' as DateType

  Resident [Facts];

HIC

356 Views
Not applicable

hic, there isn't a table in between really. the 2 tables are linked directly. how do we proceed in this case?

qlik-2016-08-19_13-44-46.png

Thanks!

0 Likes
356 Views

Does this mean that an Associate only can have one Analyst associated?

If so, you should use

DateBridge:

Load Associate,

  AssociateBirthDateTime as CanonicalDate,

  'AssociateBirthDate' as DateType

  Resident [Associate];

Load Associate,

  AssociateReadDateTime as CanonicalDate,

  'AssociateReadTime' as DateType

  Resident [Associate];

Load Associate,

  ApplyMap('Analyst2AnalystBirthDate',Analyst,null()) as CanonicalDate,

  'AnalystBirthDate' as DateType

  Resident [Associate];

356 Views

...and if AssociateBirthDateTime is a timestamp, you should probably use

Date(Floor(AssociateBirthDateTime )) as AssociateBirthDate

instead. And the same for AssociateReadDateTime.

HIC

356 Views
Not applicable

i got there but a synthetic key is being created. why? and what is a synthetic key and what does it do here exactly? the data seems to be coming through "correctly" , but this is a small test data i setup...qlik-2016-08-19_14-57-07.png

356 Views
a_isakov
New Contributor

Thanks a lot!

0 Likes
356 Views

The "Associate-1" is redundant and should be removed. Then the synthetic key will disappear too.

HIC

356 Views
sanjyotpatkar
Contributor III

Hello Henric

I have used the above example that you had pasted in the sample qvw file and created a canonical calendar to link to the Date Raised and Date Resolved fields in my extract as below. The sub routine for generating calendar is exactly the same pasted in your qvw file.

DateLink:
// TR Extract Date Raised
LOAD
[TR #]
,
DateRaised as Date // Fact Date
,'Created' as DateType // Fact Type
RESIDENT Resident_TR_Dump
;

// TR Extract Date Resolved

LOAD
[TR #]
,
DateResolved as Date
,'Resolved'
as DateType
RESIDENT Resident_TR_Dump
;

CALL CalendarFromField('Date', 'CommonCalendar', '');
CALL CalendarFromField('DateRaised', 'CreatedCalendar', 'Created ');
CALL CalendarFromField('DateResolved', 'ResolvedCalendar', 'Resolved ');

Besides the Created and Resolved tickets I need to calculate the Outstanding tickets.

Outstanding tickets are all tickets with Status ='Open'.

I have used the Weekstart for the X axis and use the following expression count(DISTINCT{<Status={'Open'},[Referred to CPS]={'Y'}>}[TR #]) . Essentially it needs to show all tickets that are having status ='Open' even if there is a resolved date for all successive weeks until the Status is changed to Closed as of date. so if a ticket has a Open status that was created in 2015 it needs to reflect in the weekstart of 13 Oct 2016.

Issue that I am facing is that ticket is reflected only in the week that it has been created and in the week that it has been resolved. It is not being considered in the missing weeks between nor in subsequent weeks till date. Will you be able to assist me or guide me in the right direction?Capture.JPG

0 Likes
356 Views

You need to create reference dates for all dates between the creation date and the resolved date. See Creating Reference Dates for Intervals

It may be that you don't need the canonical date, but instead should use the table with reference dates only.

HIC

0 Likes
356 Views
robert99
Valued Contributor II

"The canonical calendar fields are excellent to use as dimensions in charts, but are somewhat confusing when used for selections. For this, the fields from the standard calendars are often better."

I don't agree with this comment about selections. Has Henric had a change of heart on this (Feb 2014 post). Or does anyone else have any views on this?

I much prefer just having one calendar that can be used to make all selections particular using SENSE self service. I will now do everything I can to link all the date fields to one canonical calendar. Its so much easier for the users to understand. And as I set up all the expressions I just ensure they are correctly set up and labelled for the various dates. Like Rev£Invoiced and another one for Rev£Orders. With only one Calendar (Month, Year etc)

At times I have not been able to do a canonical date for all the required dates eg there is no suitable table with all the required attributes. Even then I have used a canonical date for the tables where I could. And then used a Date Island approach using the CanonicalDate. When needed I only create calendar fields for dimensions needed. I label then so it discourages users to use them for selections (eg MthXYZ.DontUseSelections and rename in the visualisation)

I've done this because of the number of times users just make a wrong date selection. I've done it myself when a hurry and curse and wonder why the figures make no sense.

Is my approach unwise or not? For me canonical dates are brilliant because unlike a date island approach they can be used for both selections and dimensions. They are harder to set up though and care is needed.

Thanks for any comments on this. I'm certainly not a Qlik script expert.   

0 Likes
356 Views
mov
Esteemed Contributor III

"Or does anyone else have any views on this?"

Here is a recent real life situation (simplified).

There are Orders with order date, estimated shipping date, and actual shipping date.  User wants three separate charts - number of orders by each of the above dates.

If use canonical date, when you select, for example, year 2015, you get 2015 on each chart - but it will be about different set of Orders.  An Order could be placed in 2014 and scheduled to ship in 2015.  And, shipped in 2016...  But users don't realize that, unless you announce it loudly.  And, when you do, they say, "No, no, I want the same Orders to be used in all three charts".  Hence, they needed three separate calendars.  So, if they select 2015 for Estimated Shipping date, they will see orders that were placed in 2014 or were shipped in 2016.

356 Views

I probably have changed my mind on some issues, but in this question I think I have consistently spoken in favor of using multiple calendars for the selections.

If you use a canonical date as a general date selection, you will sooner or later end up in a situation where the user has selected a specific quarter/month/week but he will still see orders that were placed before his selection. E.g. He selects Oct 2016 but still sees orders placed in May 2016. The reason is of course that some orders weren't shipped until October.

This is a correct result given the data model, but it will be very confusing for a business user. This is why I advocate that the different dates should be displayed in different list boxes. Only then will it be obvious to the business user what goes on.

0 Likes
356 Views
sanjyotpatkar
Contributor III

Thanks and I did find the article useful. I still have few queries.

1) Please can you direct me to the master calendar script containing Referencedate as the Reference date from the Policies_x_Dates links to the Master Calendar. I wasn't able to find the Master Calendar script for it in the blog..

2) I have multiple tables that I am loading data.I had used the canonical calendar script since I had to depict opened vs resolved  from 4 tables on one time axis.each of these tables have different field names for Data Opened and Date Closed. It worked fine until Outstanding count was not expected

How do i then modify Referecedate script to include multiple Opened and Resolved dates from different tables on one X axis?

0 Likes
356 Views