6 Replies Latest reply: Jan 27, 2011 2:51 PM by John Witherspoon RSS

    Best Practices for Date Hierachies

    Jonathan Shaltz

      Good afternoon. When I have a SQL date field, I'd like to use it as an intelligent hierarchy in QlikView. This doesn't seem to be default behavior, though. The best I've come up with so far is to create a group and populate it with three expressions: =Year(DateField), =Month(DateField), and =Day(DateField). This seems like more trouble than I'd expect for something so basic and universal. Everyone has dates, and they're used as drill-down hierarchies 99% of the time. Is there an easier way to achieve this?

      I could include the three fields in my SQL sproc instead, but that would increase the amount of data that QlikView has to load from the server, which is already time-consuming.

        • Best Practices for Date Hierachies
          John Witherspoon

          A typical QlikView practice here is to create a Calendar table. The calendar table would link to your data by DateField, and then would also contain the year, month, day, and whatever else you wanted. For a hierarchy, yes, you'd make a drill down group with Year, Month and Day.

          I agree that QlikView could handle dates better. However, I assume their philosophy here is to let you create the hierarchy you want, not the hierarchy they decided to give you. Even if 99% of the time we use date drill down hierarchies (which is certainly NOT the case at our shop), why should it be Year, Month and Day? Why not Year, Quarter, Month and Week? Did you want the calendar year or a fiscal year? If you want a Week in your hierarchy, how is Week to be defined - first day of the week? A week number? If a week number, are you using the ISO 8601 week number? ISO 8601 week number modified to start on Sunday? US week? QlikTech could have made a decision and given you a default date hierarchy, but given all the variations, it might only be useful in 10% of applications, and would involve fields generated by the system instead of generated by script. Basically, it would be a fairly fundamental violation of how the product works for something that might not be that useful to most people. "Ugh, stupid default QlikView date hierarchy. It never does what I want. I always have to build my own."

          Again, I think QlikView could handle dates better. I consider their handling of dates a weakness of the product, even if I think I understand why they handle dates the way they do. But if you're suggesting that QlikView should simply by default create a date hierarchy for you with Year, Month and Day without you even asking for it, I'll disagree strongly.

            • Best Practices for Date Hierachies
              Jonathan Shaltz

              Thanks for the reply. Hmm, that's disappointing. I already have a Dates table in my warehouse, I'll pull the extra fields and accept the additional querying time. I'd like to avoid adding new code to every script, decentralized solutions scare me. If the powers that be decide to start defining weeks to start on Sunday rather than Saturday, I'd rather not have to change dozens of separate files.

              I'm curious: how would you use dates, if not hierarchically? If there are just a few values it doesn't matter, sure, but surely it's passing rare to work with large quantities of dates and not use a hierarchy. As for varying hierarchies, that's certainly true. Marketing vs calendar quarters come to mind. There's not infinite variation, though. Most U.S. companies will use one of three or four common standards (calendar months, October FY, etc.) and most European companies one of a few more.

              The simple solution is for QlikView to have built-in intelligence regarding dates, like a virtual table with fields for Year, Quarter, etc. Have YQMD be the default date hierarchy and let users choose another from a list of presets. Better to be convenient for some customers (I'd bet 80%, not 10%) than none, and there would be nothing stopping users from creating custom hierarchies if necessary. Consider the Text() function: because of regional variations, it produces the "wrong" format for many, maybe even most customers, but it's better than none at all.

                • Best Practices for Date Hierachies
                  John Witherspoon

                  Maybe we're just using "hierarchy" differently. In QlikView, I consider a date hierarchy to be a drill down group, or in theory a hierarchical load, though I don't think I've ever seen someone use that for dates. I almost never use drill down groups. I have yet to use a hierarchical load.

                  But yes, obviously if I have Year, Quarter, Month and Date fields in my application, there's conceptually a natural hierarchy there. It would be silly to have nothing but Date fields. However, if I have a Week field, it doesn't necessarily fit into that hierarchy unless you force your weeks to start and end on month boundaries, which is probably very uncommon.

                  Most of our applications simply include a number of date fields to select from. So we might, for instance, have Year, Quarter, Month, Week and Date fields. In most cases, all of the lower level fields contain the year in them, so Month would be "Jan 2011" for instance. My Weeks are typically the start date of the week, with weeks starting on Sunday.

                  I often create a cyclic group of these date fields when I'm charting a time line, allowing users to see the time line at various levels of detail. But I stay away from drill down groups, because I want them to have separate control over what date range is shown in the chart.

                  So maybe I should be agreeing with you instead - for every Date field, a typical application will also have at least Year and Month fields. So maybe QlikView should generate those, and maybe Quarter automatically for you. On the other hand, I typically use months like "Jan 2011", while many other people typically use "January", leaving the year off. It could, I suppose, call mine "Year Month" or "Month Year", but now it's naming things for me, and what if I wanted my names to be simpler, and just use Month for "Jan 2011"? I suppose what we'd want is to get defaults by default, but be able to, say, remove a checkmark somewhere and generate your own date fields instead of accepting the QlikView defaults. Seems reasonable.

                  I think QlikTech's philosophy here is to treat all fields the same. I think that is a mistake. Specifically, I think they should have special handling for date information and regional information. Dates should perhaps have some default handling, some default dates generated. Also, when put on a timeline, I think dates should be treated as continuous data, and just because your data set doesn't include a sale on Jan 21, 2011 doesn't mean that that date simply doesn't exist. A calendar is continuous whether or not there are sales on any particular date. You can get around it, but QlikView makes it very difficult, because it's all about showing you the data you have, not the data you don't have. Regional information also seems like it deserves some special handling, with maps and other information. It seems silly to have to try to interact with Google maps, or to not have a function to calculate the distance between two points on a globe. To me, all fields are not equal. Dates and regions are different, and should be handled differently in my opinion. There may be other categories of fields like that as well.

                  In regards to your specific situation, using the dates table sounds better than recreating the data, even if it is easy to recreate the data. So I think you're on the right track. To avoid reloading this data all over the place, create a Calendar.qvw that loads in all the date information, supplements with any additional date-related data you might want in QlikView, and stores it to a Calendar.qvd. Then your user applications should load from Calendar.qvd instead of from the database. If you then want your default calendar, you could do a LOAD * FROM Calendar.qvd (QVD). I don't do load * because I want to control which fields an application uses, and not be affected if someone adds more fields to a QVD, but you could for maximum simplicity. I'd also probably set up a min and max date and load between those dates in a typical application. And some applications have multiple date fields, possibly requiring renaming as you read in the Calendar qvd. But yeah, that's how I'd approach it.

                    • Best Practices for Date Hierachies
                      Jonathan Shaltz

                      Interesting points, thanks for taking the time to cover all this.

                      I'm puzzled by one point: what's the distinction between "drill down groups" and "cyclic groups"? The only groups I've seen so far are just called "groups".

                      I quite agree regarding your other points on dates and geographic data. The concept of having hierarchies "just happen" is good, but there should be some intelligence in place for obvious natural hierarchies like these. I'm sure I'll have users puzzled by missing date values, too. It'd be confusing to have equal space in a chart devoted to Jan, Feb, Mar, and Nov, especially if users switch back and forth between views without closely watching changing labels.

                      So I'd load pre-computed, pre-compressed data from a single QlikView file, and get centralized maintenance with blazing performance. And the Calendar QVD can use the database's Dates table as its source, so changes there can be very easily propagated out. Perfect!

                      This is drifting further away from the original question, but based on your last suggestion it sounds like it might make sense to put each fact table (we have a full warehouse in SQL) into a QVD, so that "kitchen sink" queries can join off of these rather than trying to pull zillions of records from SQL. I would probably continue using sprocs for smaller summary reports that normally return few records. Does this make sense based on your experience, or am I heading in the wrong direction?

                        • Best Practices for Date Hierachies
                          Jonathan Shaltz

                          Looking more closely, I see the radio buttons for "cyclic" or "drill-down" now. I'll read up on these.

                          • Best Practices for Date Hierachies
                            John Witherspoon

                             


                            Jon of All Trades wrote:I'm sure I'll have users puzzled by missing date values, too.

                            It's a solvable problem, but the solution is complicated and far from obvious. The attached example covers the technique. It's an example of something that I think should be default behavior for dates, or at least selectable with a single obvious checkmark.

                             


                            Jon of All Trades wrote:This is drifting further away from the original question, but based on your last suggestion it sounds like it might make sense to put each fact table (we have a full warehouse in SQL) into a QVD, so that "kitchen sink" queries can join off of these rather than trying to pull zillions of records from SQL. I would probably continue using sprocs for smaller summary reports that normally return few records. Does this make sense based on your experience, or am I heading in the wrong direction?

                            Yes, this makes sense based on my experience. We load the raw data out of our databases, do typically minor but occasionally major transformations, and store the resulting QlikView tables in QVDs. It centralizes and standardizes how QlikView sources and looks at data, and makes sure that you don't hit the database any more often than you need to. These QVDs then basically grow into a sort of data warehouse. Most of our user applications just load from QVDs with no SQL in them at all, though sometimes for small tables used in a single application, yeah, we just stick in some SQL.

                            In your case, it sounds like you already have a data warehouse, so you won't likely need much transformation as you load into QlikView. But it will still be useful to centralize the data reads and only do them once no matter how many user applications use the data.

                            Edit: Some companies take this a step further. They'll create a QVW that combines a bunch of related tables from QVDs. Let's say everything you might want to know when looking at order data - order information, product information, price information, whatever. But that QVW will have no charts in it, and exists only to combine the data. Then they'll create various order applications by doing a binary load from that QVW. We don't typically do that here, as we tend to just have one application for a given area, a single orders application in this case, just with a lot of tabs serving a lot of needs. In one case, I've then fiddled with the application and QlikView publisher to generate separate user applications from there during distribution, so that different tabs are included in different user applications. I handled it that way in that case because several of the tabs were shared in common, and I didn't want to maintain the objects on those tabs in multiple places. Anyway, lots of ways to avoid duplicate maintenance.