Qlik Community

Qlik Design Blog

All about product and Qlik solutions: scripting, data modeling, visual design, extensions, best practices, etc.

The Master Time Table

The Master Calendar table is a central component in many QlikView applications: It is a dimension table listing different calendar attributes such as Year, Month, Day, etc.

But what about time attributes, such as hours, minutes and seconds? How should these be handled? Should these also be included in the Master Calendar? Or should you create a Master Time table?

 

 Often you should create a separate table for the Time dimension. To understand why, we need to look at the cardinality of the field used as key, i.e. the number of possible values. If a date is used as key, there can be at most 366 unique values per year. Hence, the Master Calendar will have at most 366 records per year. This makes the Master Calendar a small, efficient dimension table.

Time cardinality per year.png

 

But if we instead use a timestamp as key, we have a different situation. A timestamp rounded to the nearest second will have over 30 million possible values per year. And if it has milliseconds too, it becomes even worse… A timestamp usually has almost the same number of unique values as the transaction table has records. It goes without saying that such a timestamp is inappropriate as dimension key. The dimension table would in many cases become just as big as the transaction table and nothing would be gained.

 

So, what should we instead do?

Time cardinality per day.png

 

One good way is to convert the timestamp to two keys: Date and Time. The Date key needs to be truncated to the nearest integer so that no time information remains. For the Time key, it's the other way around: The integer part of the number needs to be removed so that no date information remains. In addition, it should be truncated - preferably to the nearest minute or 10-second interval - to keep the cardinality down.

 

These keys can be created through:

 

     Date(Floor(Timestamp)) as Date
     Time(Floor(Frac(Timestamp),1/24/60),'hh:mm') as Time

 

This way you will be able to have two master tables, one with 366 records per year, and one with perhaps 1440 records – both tables small and efficient. The Master Time table can have fields for hours, minutes and e.g. work shifts and can easily be created from the above key.

 

However, you lose information when you remove the information about seconds and milliseconds. So, you need to ask yourself whether you need this information or not. If you do, the best option is often to keep the original timestamp in the transaction table, in addition to the created keys, so that it can be used to calculate different measures. If not, just don’t load the original timestamp.

 

Summary:

 

  • Think of the cardinality when you create dimensions.
  • Make sure that the key to the Master Calendar table is an integer (formatted as a date): Use the Floor() function to remove the time component.
  • If you need hour and minute as fields, create a Master Time table.
  • Make sure that the key to the Master Time table is the fractional part of the timestamp, truncated to e.g. the nearest minute: Use a combination of the Floor() and the Frac() functions to create it.

 

HIC

17 Comments
mov
Esteemed Contributor III

Thanks Henric,

Another advantage is that the long timestamp field is replaced with two much shorter fields, Date, and Time.  It noticeably reduces the application size on disk and in memory.

Regards,

Michael

757 Views
jjordaan
Valued Contributor

Ths again for a great post!

I now explicit format the key to the mastercalendar with the num function so Num([Posting Date]) AS %Date.

What I understand now is that it is better to format it to date?

Thx in advance

0 Likes
757 Views

Whether it is formatted as number or as date is really not important. Both will work equally well. The important thing is that it is truncated to be an integer.

(But I would still format it as a date, since that is what it is.)

HIC

0 Likes
757 Views
jjordaan
Valued Contributor

Thanks Henric

0 Likes
757 Views
Not applicable

Would you autogenerate as you do in the master calendar?

0 Likes
757 Views

Yes, I would. The option of using Load distinct can take time...

HIC

0 Likes
757 Views
Not applicable

Ahh then i think i need some help, causes it is giving me some errors.

Garbage after statement

TimeField:

Load

    date(0,27152777777778 + RowNo() -1) as TempTime

AutoGenerate

    0,29027777777778 - 0,27152777777778 +1

-- And im doing almost like mastercalendar:

///******Create min/max variables*********

Let vMinTime = num(peek('Time', 0, 'Hej'));

Let vMaxTime = num(peek('Time', -1, 'Hej'));

////*******Temp Calendar******

TimeField:

Load

    date($(vMinTime) + RowNo() -1) as TempTime

AutoGenerate

    $(vMaxTime) - $(vMinTime) +1;

////   

MasterCalendarTime:

Load

TempTime as Time,

Hour(TempTime) as Week,

Minute(TempTime) as Year

RESIDENT TimeField

Order by TempTime ASC;

DROP table TimeField;

0 Likes
757 Views

The script needs decimal point, and $(vMaxTime) is expanded with a decimal comma. You should use the numeric expansion $(#vMaxTime) instead. The numeric expansion $(# ... ) always generates a decimal point.

Further, I would use an integer as increment in the Load, e.g.

Load

     Time(15SecondPeriod / 24 / 60 / 4 ) as Time;

Load

     RecNo() as 15SecondPeriod

     Autogenerate ... ;

HIC

0 Likes
757 Views
Not applicable

You confused me even more now Where do you want the integer load? And how or what or ...i dont understand

0 Likes
757 Views

Sorry, I should be clearer... The Autogenerate should in my mind be used using RecNo() (integer) that defines the smallest interval in your dimension, e.g.

Load

     RecNo() as 15SecondPeriod

     Autogenerate 24 * 60 * 4 :

so "15SecondPeriod" is an integer sequence number that you use to define further fields. In your case, you use non-integers after the "Autogenerate" keyword, and then it becomes confusing.

HIC

0 Likes
757 Views
Not applicable

Ahhh - but why do you wanna divide it by 4 ? Then you get too many values dont you? If i remove the multiply and divide by 4 i get my exact numbers i want in my dimension.

But then my chart is showing almost all my data entries at the 23 hour (and it should not)

0 Likes
757 Views

My code will create one entry per 15-second interval. Then you need

     24 * 60 * 4

intervals per day. And the timestamp you can create by

     Time( RecNo() / 24 / 60 / 4 )

If all your data entries are around 23:00, then you have a numerical problem. Format the field as number and see what you get. A time should always be between 0 and 1.

HIC

0 Likes
757 Views
Not applicable

I see your point - Its good to make 15second intervals so you dont get a row for each second.

However i making this on Hour/minutes only - So i remove the 4. But still i get totally different results. If i use this method i get 407 values which hits null.

If i do like this instead though, i get all my values correct (but the strange thing is that it contains the excact same values as yours)

//MasterCalendarTime:

//Load distinct

//Time as Time,

//Hour(Time) as Hour,

//Minute(Time) as Minute

//

//RESIDENT Hej

//

//Order by Time ASC;

However i did notice that if i run this Time( RecNo() / 24 / 60) its shows the stamp as 20:01:00 where my TimeKey is 20:01 - Could this be an issue?

0 Likes
757 Views

Why you get values with NULL, I cannot say without seeing data.

The formatting - 20:01:00 or 20:01 - is just formatting. It should not matter. If you want to, you should use a formatting function:

     Time(Time,'hh:mm') as Time

HIC

0 Likes
757 Views
Not applicable

Okay, thank you for the help anyways Ill keep this going on in a forum post instead. Ill post a document in there you can see if it is Thanks

0 Likes
757 Views
isingh30
Valued Contributor

Thank you Henric

0 Likes
757 Views
bbmmouha
Contributor

@thomas Jensen

hello

i have the same problem as you

do you have the solution

if it's the case may you share it

thank you

0 Likes
757 Views