Qlik Community

Qlik Design Blog

All about product and Qlik solutions: scripting, data modeling, visual design, extensions, best practices, etc.

Employee
Employee

Canonical Date

A common situation when loading data into QlikView is that the data model contains several dates. For instance, in order data you often have one order date, one required date and one shipped date.

 

Base model.png

 

This means that one single order can have multiple dates; in my example one OrderDate, one RequiredDate and several ShippedDates - if the order is split into several shipments:

 

Logic 1.png

 

So, how would you link a master calendar to this?

 

Well, the question is incorrectly posed. You should not use one single master calendar for this. You should use several. You should create three master calendars.

 

The reason is that the different dates are indeed different attributes, and you don’t want to treat them as the same date. By creating several master calendars, you will enable your users to make advanced selections like “orders placed in April but delivered in June”. See more on Why You sometimes should Load a Master Table several times.

 

Your data model will then look like this:

 

Model with spec calendars.png

 

But several different master calendars will not solve all problems. You can for instance not plot ordered amount and shipped amount in the same graph using a common time axis. For this you need a date that can represent all three dates – you need a Canonical Date. This is how you create it:

 

First you must find a table with a grain fine enough; a table where each record only has one value of each date type associated. In my example this would be the OrderLines table, since a specific order line uniquely defines all three dates. Compare this with the Orders table, where a specific order uniquely defines OrderDate and RequiredDate, but still can have several values in ShippedDate. The Orders table does not have a grain fine enough.

 

This table should link to a new table – a Date bridge – that lists all possible dates for each key value, i.e. a specific OrderLineID has three different canonical dates associated with it. Finally, you create a master calendar for the canonical date field.

 

Full model.png

 

You may need to use ApplyMap() to create this table, e.g. using the following script:

 

     DateBridge:
     Load
          OrderLineID,
          Applymap('OrderID2OrderDate',OrderID,Null()) as CanonicalDate,
          'Order' as DateType
          Resident OrderLines;

     Load
          OrderLineID,
          Applymap('OrderID2RequiredDate',OrderID,Null()) as CanonicalDate,
          'Required' as DateType
          Resident OrderLines;

     Load
          OrderLineID,
          ShippedDate as CanonicalDate,
          'Shipped' as DateType
          Resident OrderLines;

 

If you now want to make a chart comparing ordered and shipped amounts, all you need to do is to create it using a canonical calendar field as dimension, and two expressions that contain Set Analysis expressions:

 

     Sum( {$<DateType={'Order'}>} Amount )
     Sum( {$<DateType={'Shipped'}>} Amount )

 

Bar chart.png

 

The canonical calendar fields are excellent to use as dimensions in charts, but are somewhat confusing when used for selections. For this, the fields from the standard calendars are often better.

 

Summary:

  • Create a master calendar for each date. Use these for list boxes and selections.
  • Create a canonical date with a canonical calendar. Use these fields as dimension in charts.
  • Use the DateType field in a Set Expression in the charts.

 

A good alternative description of the same problem can be found here. Thank you, Rob, for inspiration and good discussions.

 

HIC

142 Comments
datanibbler
Esteemed Contributor


very good post! Just a bit hard to understand for "the rest of us". I'll have to read it once more. Right now, I have only one scenario for that - where I currently use two master_calendars - but going forward, there are sure to be more.

Best regards,

DataNibbler

1,134 Views
Not applicable

very clever

Thanks for so good explanation

Fabrice

0 Likes
1,134 Views
Not applicable

As always Great! Useful Past !

Good Explanation.

Regards,

Divya

0 Likes
1,134 Views
peschu123
Contributor III

Thank you for summarizing this in a post.

I have several dates in almost all my reports and it was one of the first (harder) things when I got in contact with QV.

An example how you use it in the user interface would be great. I often experience that users don't understand the usage or the methodology behind it. Perhaps I make it to complicated.

Another point I would add is how to handle forecast values. There are often "Planned shipment date" and/or a "Shipped Date" for example. Then you will have too much calenders. I mostly merge them to a field like "Shipment Date". ( with a Flag field "forecast" (yes,no) )

This lets you use one dimension in the chart even with forecast values.

0 Likes
1,134 Views
Not applicable

Hey HIC,

Great post as alway !

Somehow in most of the discussion with the client, I find they are keen to have just one calendar (rather than having separate calendars for the disparate important dates).

Thanks,

SK

0 Likes
1,134 Views
Not applicable

very good post

0 Likes
1,134 Views
dvqlikview
Honored Contributor II

Hi HIC,

Thank you for another great post and your clarity of thought is very good. I just wanted to clarify two things here.

1. Reading the below script you meant "CanonicalDate" instead of "Date" as the field name? Or am I missing something here...

     DateBridge:

     Load OrderLineID, Applymap('OrderID2OrderDate',OrderID,Null()) as Date, 'Order' as DateType

          Resident OrderLines;

     Load OrderLineID, Applymap('OrderID2RequiredDate',OrderID,Null()) as Date, 'Required' as DateType

          Resident OrderLines;

     Load OrderLineID, ShippedDate as Date, 'Shipped' as DateType

          Resident OrderLines;

2. Do we need the "Distinct" prefix while loading these tables?

Thanks,

DV

1,134 Views
Employee
Employee

1. You are of course right. It should be "CanonicalDate".

2. It could be useful, but it wouldn't make a big difference. Duplicates can only occur if several order lines are shipped on the same day. And a distinct clause would remove these.

HIC

1,134 Views
dvqlikview
Honored Contributor II

Thank you HIC. Cheers!

0 Likes
1,134 Views
sundarakumar
Valued Contributor II

Great Post... Thanks a lot...

0 Likes
1,134 Views
Employee
Employee

@ sk123456 I know you are right about clients wanting to have one calendar only, but I still think that it is a mistake to deliver an app with canonical date only. It is an over-simplification of the problem that leads the user to think that the data model is simpler than it really is.

Users will never understand what a selection in the canonical date really means. They will - sooner or later - ask questions like "Why are there still transactions from February possible, when I have selected March only?" And this will be more difficult to explain than to include several master calendars from the start.

HIC

1,134 Views
mov
Esteemed Contributor III

Interesting idea.
In the similar situations, I usually create "common" calendar as a data island.  The Rob/HIC solution should have a better performance, and it allows more options for the end users.

Thanks,
Michael

1,134 Views
carbal1952
Contributor II

Thanks HIC:

I used this solution from long ago but i always thought it wasn't very "academical".

This post reinforces my idea.

CB.


0 Likes
1,134 Views
michael_solomon
New Contributor III


Thank HIC, great post again, but...

The date bridge is essentially a concatenation of the granular record X number of times where X is the number of date fields you have, so becomes an extra fact table with more records than the original fact tables Orders and OrderLines. This leaves you with another factor in deciding how much to normalise/denormalise the fact tables. ie. a single fact table will require merging the fact tables (see below), but then have to load in each record X times. Or am I missing something?

Common Date Dimension.JPG.jpg

1,134 Views
kevinpintokpa
Contributor II

A very well written article!

It may be helpful to provide information including the script command to generate the two mapping tables OrderID2OrderDate and OrderID2RequiredDate.

1,134 Views
valerio_fatatis
New Contributor III

g o o d ...

0 Likes
1,134 Views
michael_gardner
Contributor III

Kevin,


I think the two mapping tables should be like below, though I am not 100% sure so please correct me if I am wrong.

OrderID2OrderDate:

MAPPING LOAD

OrderID,

OrderDate

RESIDENT Orders;

OrderID2RequiredDate:

MAPPING LOAD

OrderID,

RequiredDate

RESIDENT Orders;

0 Likes
1,134 Views
Employee
Employee

@ Michael Solomon    I would not join the two tables. Then the same Amount will end up on three records and the sum will be incorrect. But apart from that you are right that the DateBridge might become a long table.

@ Kevin Pinto and @ Michael Gardner    Michael is spot on with his suggestion on the mapping tables. See more on http://community.qlik.com/blogs/qlikviewdesignblog/2012/09/18/one-favorite-function-applymap

HIC

1,134 Views
vinodh_qlikview
New Contributor

Great post HIC..........Thanks

0 Likes
1,134 Views
Not applicable

Henric Cronström  Yes, I agree with you and somehow it has always been difficult for me to convince them right away.

But now I have your post (and your reply specifically!) to refer to them for better clarity and details....

SK

0 Likes
1,134 Views
michael_solomon
New Contributor III

Yes you're right, it would simply force you to always need either Set Analysis or an IF statement on the DateType.

0 Likes
1,134 Views
bill_mtc
Contributor

This post is GREAT! I might have changed the current data model. Thanks for the new ideas.

0 Likes
1,134 Views
Not applicable

This is interesting! I was recently at your developer course in Stockholm were we had the exact problem, an OrderDate and a ShipmentDate. The solution according to the exercise "Joining Facts and Shipments", the solution is to use a common Date field and a "Transtype" field that defines if the Date is a ShipmentDate or an OrderDate.

qv-dev-159.JPG.jpg

I find it interesting that the course teaches the exact opposite of what you're proposing. Your solution seems to be the most viable though.

0 Likes
1,134 Views
Not applicable

I often find the same problem with solutions Carl. I really like HIC approach and where I have always used link tables for multiple dates, I now understand the short-comings of that and I what I am able to provide to my users now that I have split the date attributes as such

Thank you very much for this HIC

Cheers,

Byron

0 Likes
1,134 Views
Employee
Employee

@ Carl-Fredrik Herö  The two solutions are not that different. For instance, the field "TransType" is in principal the same as my "DateType". And whether you can join the tables or not, depends on the data. 

I must admit that I haven't looked at the training material. I will do that.

HIC

0 Likes
1,134 Views
Not applicable

Thanks Henric! The difference between the solution would be the use of multiple calendars, but I guess, as you say, it depends on whether you need select both Order date and Shipment date or not.

0 Likes
1,134 Views
Employee
Employee

@ Carl-Fredrik Herö

I would say that in the end it will depend on the level of complexity of the overall data model. In the example, that they post in the training book, they only have 2 time dimensions so the approach suggested works very nicely. However, in the case that you have 6-7 or more different time dimensions, then I would say that the approach suggested by Henric would work much better.

My $0.02

0 Likes
1,134 Views
MVP
MVP

Carl-Fredrik Herö : It's important to understand that the data model used in the training materials, is teaching a technique of concatenating multiple fact tables into a single fact. This is a very important data modeling technique and it needs to be presented just the way it is.

For a concatenated "single fact" structure, it's common to rename the main date field for each "slice" of data, and call all the "main" data fields with the same name. Notice that each slice of the concatenated fact, has only one "main" date field.

However, if you had to deal with multiple date fields for several tables, like in HIC's example, you'd have to use the technique that HIC had described.

The two solutions don't contradict each other, these are simply two different techniques used for two different data models.

Oleg Troyansky

www.masterssummit.com

0 Likes
1,134 Views
Not applicable

Nice one.

If it was just about the date and time.

In such cases you will properly also have different order, shipped and destination adresses  and perhaps pitstops on the way, how is the the correct database setup if you also have the standard fields like zip, city, region and country? I mean in relation to data redundancy, normalization etc..

0 Likes
1,134 Views
Employee
Employee

You are absolutely right that it (in theory) is the same problem with Addresses, Zip codes, Countries, etc. But there is a difference in how these fields are perceived: A user usually accepts several fields here: CustomerAddress, ShipperAddress and SupplierAddress. But the user does not always accept three date fields.

Anyway, should you want a "generic" address for Customers, Shippers and Suppliers, then you should introduce a "Canonical Address" exactly as described in this blog post.

On data redundancy: Many developers try to always avoid data redundancy, but I am convinced that this is a huge mistake. At least for BI applications where the only goal is to make it easy for a user to learn from data. This is very different from designing databases where you have an explicit goal to avoid data redundancy since it would cause major problems. In a BI application however, data redundancy doesn't cause problems, since both copies of data are fed from the same source. No, in BI, data redundancy sometimes solves problems!

HIC

1,134 Views