hic
Former Employee

 

A common situation when loading data into a Qlik document is that the data model contains several dates. For instance, in order data you often have one order date, one required date and one shipped date.

 

Base model.png

 

This means that one single order can have multiple dates; in my example one OrderDate, one RequiredDate and several ShippedDates - if the order is split into several shipments:

 

Logic 1.png

 

So, how would you link a master calendar to this?

 

Well, the question is incorrectly posed. You should not use one single master calendar for this. You should use several. You should create three master calendars.

 

The reason is that the different dates are indeed different attributes, and you don’t want to treat them as the same date. By creating several master calendars, you will enable your users to make advanced selections like “orders placed in April but delivered in June”. See more in “Why You sometimes should Load a Master Table several times”.
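
A minimal sketch of two such calendars, assuming that OrderDate is found in the Orders table, that ShippedDate is found in the OrderLines table, and that both fields hold whole dates without a time part. The table names OrderCalendar and ShippedCalendar and the derived field names are only illustrations:

     // Hypothetical calendar for the order date; links to Orders through OrderDate
     OrderCalendar:
     Load Distinct
          OrderDate,
          Year(OrderDate) as OrderYear,
          Month(OrderDate) as OrderMonth
          Resident Orders;

     // Hypothetical calendar for the shipped date; links through ShippedDate
     ShippedCalendar:
     Load Distinct
          ShippedDate,
          Year(ShippedDate) as ShippedYear,
          Month(ShippedDate) as ShippedMonth
          Resident OrderLines
          Where not IsNull(ShippedDate);

With these in place, a user could select April in OrderMonth and June in ShippedMonth at the same time. Note that a calendar built this way only contains dates that exist in the data, so it may have gaps.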

 

Your data model will then look like this:

 

Model with spec calendars.png

 

But several different master calendars will not solve all problems. For instance, you cannot plot ordered amount and shipped amount in the same graph using a common time axis. For this you need a date that can represent all three dates – you need a Canonical Date. This is how you create it:

 

First you must find a table with a grain fine enough; a table where each record only has one value of each date type associated. In my example this would be the OrderLines table, since a specific order line uniquely defines all three dates. Compare this with the Orders table, where a specific order uniquely defines OrderDate and RequiredDate, but still can have several values in ShippedDate. The Orders table does not have a grain fine enough.

 

This table should link to a new table – a Date bridge – that lists all possible dates for each key value, i.e. a specific OrderLineID has three different canonical dates associated with it. Finally, you create a master calendar for the canonical date field.

 

Full model.png

 

You may need to use ApplyMap() to create this table, e.g. using the following script:

 

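     // Date bridge: one record per order line and date type.
     // The two maps must be defined earlier in the script using Mapping Load.
     DateBridge:
     Load
          OrderLineID,
          ApplyMap('OrderID2OrderDate',OrderID,Null()) as CanonicalDate,
          'Order' as DateType
          Resident OrderLines;

     // Same set of fields as above, so this load is concatenated to DateBridge
     Load
          OrderLineID,
          ApplyMap('OrderID2RequiredDate',OrderID,Null()) as CanonicalDate,
          'Required' as DateType
          Resident OrderLines;

     // ShippedDate is read directly from OrderLines, so no map is needed
     Load
          OrderLineID,
          ShippedDate as CanonicalDate,
          'Shipped' as DateType
          Resident OrderLines;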
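
The script above refers to two mapping tables that are not shown. A minimal sketch of how they could be defined, assuming that the Orders table is already loaded and holds OrderID, OrderDate and RequiredDate. These statements must come before the DateBridge load:

     // Map from OrderID to OrderDate (first field is the key, second the value)
     OrderID2OrderDate:
     Mapping Load
          OrderID,
          OrderDate
          Resident Orders;

     // Map from OrderID to RequiredDate
     OrderID2RequiredDate:
     Mapping Load
          OrderID,
          RequiredDate
          Resident Orders;

Mapping tables are dropped automatically at the end of the script run, so they do not show up in the data model.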

 

If you now want to make a chart comparing ordered and shipped amounts, all you need to do is to create it using a canonical calendar field as dimension, and two expressions that contain Set Analysis expressions:

 

     Sum( {$<DateType={'Order'}>} Amount )
     Sum( {$<DateType={'Shipped'}>} Amount )
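
Without the restriction on DateType, a plain Sum(Amount) per canonical date would mix ordered, required and shipped amounts on the same date. The same restriction can also be written with an If() function. This is only a sketch of an alternative: it should give the same result as long as no selection is made in DateType, but it usually calculates slower than Set Analysis on large data sets:

     Sum( If( DateType='Order', Amount ) )
     Sum( If( DateType='Shipped', Amount ) )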

 

Bar chart.png

 

The canonical calendar fields are excellent to use as dimensions in charts, but are somewhat confusing when used for selections. For this, the fields from the standard calendars are often better.

 

Summary:

  • Create a master calendar for each date. Use these for list boxes and selections.
  • Create a canonical date with a canonical calendar. Use these fields as dimensions in charts.
  • Use the DateType field in a Set Expression in the charts.

 

A good alternative description of the same problem can be found here. Thank you, Rob, for inspiration and good discussions.

 

HIC

192 Comments
michael_solomon
Partner - Contributor III

Yes you're right, it would simply force you to always need either Set Analysis or an IF statement on the DateType.

bill_mtc
Partner - Creator

This post is GREAT! I might have changed the current data model. Thanks for the new ideas.

Not applicable

This is interesting! I was recently at your developer course in Stockholm, where we had this exact problem: an OrderDate and a ShipmentDate. According to the exercise "Joining Facts and Shipments", the solution is to use a common Date field and a "Transtype" field that defines whether the Date is a ShipmentDate or an OrderDate.

qv-dev-159.JPG.jpg

I find it interesting that the course teaches the exact opposite of what you're proposing. Your solution seems to be the most viable though.

Not applicable

I often find the same problem with solutions, Carl. I really like HIC's approach. I have always used link tables for multiple dates, but I now understand the shortcomings of that, and what I am able to provide to my users now that I have split the date attributes this way.

Thank you very much for this HIC

Cheers,

Byron

hic
Former Employee

@ Carl-Fredrik Herö  The two solutions are not that different. For instance, the field "TransType" is in principle the same as my "DateType". And whether you can join the tables or not depends on the data.

I must admit that I haven't looked at the training material. I will do that.

HIC

Not applicable

Thanks Henric! The difference between the solutions would be the use of multiple calendars, but I guess, as you say, it depends on whether you need to select both Order date and Shipment date or not.

Jesús_Centeno
Employee

@ Carl-Fredrik Herö

I would say that in the end it depends on the level of complexity of the overall data model. In the example in the training book, there are only 2 time dimensions, so the suggested approach works very nicely. However, if you have 6-7 or more different time dimensions, then I would say that the approach suggested by Henric would work much better.

My $0.02

Oleg_Troyansky
Partner Ambassador/MVP

Carl-Fredrik Herö: It's important to understand that the data model used in the training materials teaches a technique of concatenating multiple fact tables into a single fact. This is a very important data modeling technique and it needs to be presented just the way it is.

For a concatenated "single fact" structure, it's common to rename the main date field for each "slice" of data, and call all the "main" date fields by the same name. Notice that each slice of the concatenated fact has only one "main" date field.

However, if you had to deal with multiple date fields for several tables, like in HIC's example, you'd have to use the technique that HIC has described.

The two solutions don't contradict each other, these are simply two different techniques used for two different data models.

Oleg Troyansky

www.masterssummit.com

Not applicable

Nice one.

If it was just about the date and time.

In such cases you will probably also have different order, shipped and destination addresses, and perhaps pit stops on the way. What is the correct database setup if you also have the standard fields like zip, city, region and country? I mean in relation to data redundancy, normalization, etc.

hic
Former Employee

You are absolutely right that it (in theory) is the same problem with Addresses, Zip codes, Countries, etc. But there is a difference in how these fields are perceived: A user usually accepts several fields here: CustomerAddress, ShipperAddress and SupplierAddress. But the user does not always accept three date fields.

Anyway, should you want a "generic" address for Customers, Shippers and Suppliers, then you should introduce a "Canonical Address" exactly as described in this blog post.

On data redundancy: Many developers try to always avoid data redundancy, but I am convinced that this is a huge mistake, at least for BI applications where the only goal is to make it easy for a user to learn from data. This is very different from designing databases, where you have an explicit goal to avoid data redundancy since it would cause major problems. In a BI application, however, data redundancy doesn't cause problems, since both copies of data are fed from the same source. No, in BI, data redundancy sometimes solves problems!

HIC
