As QlikView Developers, we are often asked to supply the source data files for applications on the demo.qlik.com site. Since most applications are able to be downloaded locally, I figured I would let you in on a little secret. You can create the data files yourself and it is a fairly simple process. The QlikView scripting statement that will help you achieve your desired results is the STORE statement.QlikView help states that a QVD or a CSV file can be created by a store statement in the script. The statement will create an explicitly named QVD or CSV file. The statement can only export fields from one logical table. The text values are exported to the CSV file in UTF-8 format. A delimiter can be specified, see Load. The store statement to a CSV file does not support BIFF export.Examples:Store mytable into xyz.qvd (qvd);Store * from mytable into xyz.qvd;Store Name, RegNo from mytable into xyz.qvd;Store Name as a, RegNo as b from mytable into xyz.qvd;store mytable into myfile.txt (txt);store * from mytable into myfile.txt (txt);(The two first examples have identical function.)So let’s walk through the process. 1. Open a new QVW and binary load the original app. 2. Identify the table(s) that you would like to extract. (Table viewer).3. In the script you will either have to enter a Store statement for every table that you would like to extract. Or you could use this snippet of script that will extract all tables from the original app4. Reload the application. After the reload, the extracted files will be placed in the same folder as the reloaded QVW. Now go ahead and try it for yourself. Happy Qlikking!
...View More
Hierarchies are very common in all database and business intelligence solutions. Usually they are balanced and with a fix number of levels, and then they do not pose any problems. Just load the data, add a drill-down group, and you’re done.
But there is one type of hierarchy that is somewhat tricky to get right – an unbalanced, n-level hierarchy. Typical for this type of hierarchy is that the levels are not named, and you really don’t know on which level you need to search for a specific node.
Usually such a hierarchy is stored in an Adjacent Nodes table, i.e. a table that has one record per node and each node has a reference to its parent.
Such a table can be loaded into QlikView directly using the Hierarchy prefix. This prefix will transform the Adjacent Nodes table into an Expanded Nodes table that has additional fields that you can use in your app.
With the fields in this table, you can easily create a pivot table and a tree-view list box. Below you can see some wine districts displayed in both these object types:
One challenge with hierarchies is that you can refer to a node in two different ways: Either to the node including the entire sub-tree, or to the node only, excluding all sub-nodes. In the example with the wine districts, it would mean any wine from Bordeaux, and unspecified Bordeaux, respectively. In the pivot table above, the difference is obvious: Any wine from Bordeaux sums up to 150 units, and the unspecified Bordeaux sums up to 18 units.
A user usually wants to make selections referring to the entire sub-tree, but the above solution does not have any field for this. To create such a field, you need the second hierarchy-resolving prefix – the HierarchyBelongsTo.
This prefix will also transform the hierarchy table. The result will be a table containing one record per descendant-ancestor pair. In other words, the ancestor (tree ID) will link to all its descendants (node ID), and can thus be used to make selections of entire sub-trees. (The “TreeBridge” table in the picture below.)
But it doesn’t stop here… The above solution creates one field in which tree searches can be made, but in order to create a drill-down for trees, you need an additional table – an expanded nodes table for the trees. This can be created with a second Hierarchy statement, but now one that links to the tree ID instead of the node ID. (The “Trees” table in the picture below.)
The data model with the three hierarchy tables is the one I recommend: It generates all fields you need.
A more elaborate explanation with script examples can be found in the technical brief about Hierarchies.
HIC
Further reading related to this topic:
Authorization using a Hierarchy
Bill of Materials
...View More
Nobody likes to constantly reinvent the wheel. Completely starting from scratch on something you work on all of the time is a waste of time. People reuse parts of old QVWs all the time to save the time & effort of redoing the same things over and over again.The Dashboard, Analysis, Reporting (DAR) methodology is a foundation you can build all of your applications on while still having room to be creative and meet the varying requirements of individual clients / prospects. In a nutshell you lead with a Dashboard page, followed by Analysis pages, and finish with Reporting pages. The Dashboard gives the high level overview of the business, the Analysis pages give interactive user-driven controls to filter the data, while the Reporting pages give the most granular details. The system works on a few levels but to understand some of why it works we have to discuss how people interact with computers and how we perceive information. The attached technical paper gives an introduction into Human Computer Interaction (HCI), about Top Down versus Bottom Up perception, and how all of this comes into play in QlikView.
...View More
Variables can be used in many ways in QlikView. They can have static values or they can be calculated. But when are they calculated? At script run-time or when the user clicks? And how should they be called? With or without dollar expansion?
A few months ago, I worked on a ROI Calculator project that allowed me to use QlikView in a way that is not often used. In this application, users entered various data and assumptions into input boxes that were then used to calculate some financial figures in the script. One requirement of the application was that users were able to save the assumptions/scenarios they entered and load these scenarios at a later time as a starting point or basis for other scenarios. I thought I would share how I used input boxes, variables and reloads to fulfill the requirements.In this example, I simplified the application so that I could focus on what is being done and not so much the calculations in the script. In the example I created, students can enter their grades to see their average score.Once they enter their data they can click the Save button to store what they entered into a QVD. Users can also use the Load saved scenario button to select a previous entry to display in the input boxes. When users click the Load saved scenario button, they are prompted with the window below where they can select the scenario they want to display in the input boxes.Reloads are used often in this application to either get the scenario the user wants to see from the scenarios QVD or to save the data entered into the scenarios QVD. Variables are also used a great deal in the input boxes like this:There is a variable for each input box on the user interface. There are also various variables used to determine what action the user would like to perform. For example, in the script the first thing I check for is whether the user wants to load a selected scenario. The vLoadScenarios variable is set to 1 when the user clicks the Load saved scenario button. When vLoadScenarios is equal to 1, the data for the selected scenario is loaded and the variables used in the input boxes are set to the saved values. Then all the scenarios are loaded into the application. Other parts of the script handle creating a table with the entered data and storing it into a QVD. To view this example application in action and to see the full script, check out my technical brief.Using input boxes in this manner provides ease of use when entering data. In the case of the ROI Calculator application, there was several input boxes so being able to use previously entered scenarios as a basis for new scenarios allowed users to “tweak” or make a few changes to their assumptions to determine the ROI impact.Thanks,Jennell
...View More
Now that the waves from last week’s post about Count(distinct …) have settled, it is time for me to draw some conclusions.
First, I must say that it is astonishing that no one – including myself – sooner openly questioned the assertion that Count(distinct) is single-threaded and slow. We have all had plenty of time to do so: It is true that Count(distinct) was single-threaded and slow in version 7.52, but it was fixed already for version 8 (I double-checked it), which was released in 2007.
By the way, you can see what it looks like in 7.52 in the picture below. The two charts to the right are both finished, but the Count(distinct …) to the left is still calculating, using only one of the four cores (CPU usage = 25%). Hence, slow and single-threaded.
Compare this with the corresponding screen dump from 11.20, where it is obvious from the progress bars that Count(distinct...) is faster than the alternative ways to calculate the same number.
My first conclusion is that we need to sometimes challenge “the truth”. Because, in the software world, the truth changes. What was true yesterday is not necessarily true tomorrow. And if the map and the reality conflict, one must never forget that the map is only an incomplete model of the real world...
Further, from a technical perspective, we can see that:
Charts are calculated faster if the fields used for the aggregation reside in the largest table, in the fact table. This becomes especially true when the fact table is large.
Charts are calculated slightly faster if also the dimension fields reside in the fact table.
The above conclusions are valid only if there is an “external” dimension involved, i.e. that the chart needs to make the calculation linking over the fact table. A single number in a text box, or chart where the fact table isn't involved, will not be affected the same way.
Does this mean that you should join everything together into one, single, de-normalized fact table? For fact tables with very many records, the answer is probably “Yes”. In such cases, you will most likely need to optimize for speed, and then you should probably put everything in one table.
But when doing so, the app uses more RAM. Further, you lose clarity, as Jay Jakosky points out already in a comment of the previous blog.
A normalized model usually has a simpler script, and is easier to understand for the person that has to maintain the application or develop it further. It is simpler to make modifications, to add tables or calculations, and to create correct formulae, if the data model and the script are conceptually simple. So for smaller data sets, where the chart response time already is acceptably low, I claim that you should not optimize for speed.
Instead, you should optimize for clarity and maintainability, which means keeping things simple; i.e. keeping the data model normalized. See more in To Join or not to Join.
Bottom line: Keep it as simple as possible, and don’t make unnecessary optimizations.
"Premature optimization is the root of all evil." - Donald Knuth
HIC
...View More
We have been talking a lot about color recently. Chuck showed us how to manage colors and alpha transparencies to increase visual perception in our dashboards making it clear for business users to consume charts. A few days before, Apeksha shared some good tips about color usage in QlikView apps. If you are looking for more insight about color and interpretations I strongly recommend you to read Apeksha’s Technical Brief.Choosing the right colors.Colors are present in our apps not only in the charts but also in the backgrounds, tab row, captions, and so forth. It’s key to find a color palette that works and that lets business users consume information quickly and in an efficient way.There are lots of content and good examples on the internet about color palettes; sites like kuler.adobe.com can provide you with some nice color combinations for your next project.Some of the most popular color combinations are in colorbrewer2.org. Created by Cynthia Brewer, Mark Harrower and The Pennsylvania State University. It was originally designed for cartographers but it’s a standard in academia for any type of data visualization. (http://colorbrewer2.org/)I created a QV app (see below) with these colors palettes to facilitate their adoption by our community. Feel free to use, improve and distribute it.Testing your colorsAs discussed earlier in this blog, around 7-8% of world’s population has some short of different color perception. To create great looking QlikView apps you should be conscious that some people out there do not see colors as you do.To avoid that potential risk that could ruin your app, you could use one of the pre-designed color blind safe palettes and/or you could test your app using a tool that lets you emulate a color blind environment.
Color Oracle is a free color blindness simulator for Window, Mac and Linux. It takes the guesswork out of designing for color blindness by showing you, in real time, what people with common color vision impairments will see. http://www.colororacle.org/
Reading colorsCommon tasks are time consuming and when it comes to color it’s very common to find yourself trying to reuse corporative colors. It is probably safe to assume that we all have a color picker tool installed in our machines, but in case you do not... My personal favorite is ColorPix.It’s a standard Color picker tool but it has some features I find it very useful when working with QlikView. I use ColorPix mouse accelerators a lot, that let me copy the RGB value by clicking on the displayed number, then I just need to type RGB( Ctrl+C ) into calculated base color expression box to include the color in my app.These are 3 small utilities that make my life much easier but, what about you? Are you using any other tool to help you with your QlikView development? Share it with us in the comments! Enjoy Qliking!AMZExtra: Safe color palettes QV app
...View More
Do you belong to the group of people who think that Count(distinct…) in a chart is a slow, single-threaded operation that should be avoided?
If so, I can tell you that you are wrong.
Well - it used to be single-threaded and slow, but that was long ago. It was fixed already for – I think – version 9, but the rumor of its slowness lives on like an urban myth that refuses to die. Today the calculation is multi-threaded and optimized.
To prove that Count(distinct…) is faster than what many people think, I constructed a test which categorically shows that it is not slower – it is in fact a lot faster than the alternative solutions.
I created a data model with a very large fact table: 1M, 3M, 10M, 30M and 100M records. In it, I created a secondary key, with a large number of distinct values: 1%, 0.1% and 0.01% of the number of records in the fact table.
The goal was to count the number of distinct values of the secondary key when making a selection. There are several ways that this can be done:
Use count distinct in the fact table: Count(distinct [Secondary ID])
Use count on a second table that just contains the unique IDs: Count([Secondary ID Copy])
Use sum on a field that just contains ‘1’ in the second table: Sum([Secondary ID Count])
I also created a dimension ("Dim" in the “Dim Table”) with 26 values, also randomly assigned to the data in the fact table. Then I recorded the response times for three charts, each using “Dim” as dimension and one of the three expressions above. I made this for four different selections.
Then I remade all measurements using “Dim ID” as dimension, i.e. I moved also the dimension to the fact table. Finally, I loaded all the recorded data into QlikView and analyzed it.
The first obvious result is that the response time increases with the number of records in the fact table. This is hardly surprising…
…so I need to compensate for this: I divide the response times with the number of fact table records and get a normalized response time in picoseconds:
This graph is extremely interesting. It clearly shows that if I use a Count(distinct…) on the fact table, I have a response time that is considerably smaller than if I make a count or a sum in a dimensional table. The table below shows the numbers.
Finally, I calculated the ratios between the response times for having the dimension in the fact table vs. in a dimensional table, and the same ratio for making the aggregation in the fact table vs. in a dimensional table.
This graph shows the relative response time I get by moving the dimension or the aggregation into the fact table. For instance, at 100M records, the response time from a fact table aggregation (i.e. a Count(distinct…)) is only 20% of an aggregation that is made in a dimensional table.
This is the behavior on my mock-up data on my four-core laptop with 16GB. If you make a similar test, you may get a slightly different result since the calculations depend very much on both hardware and the data model. But I still think it is safe to say that you should not spend time avoiding the use of Count(distinct…) on a field in the fact table.
In fact, you should consider moving your ID to the fact table if you need to improve the performance. Especially if you have a large fact table.
HIC
...View More
This analogy gives an example of what UX design could be. As you know, an analogy is not the complete truth. The personas (Eileen, Linda and Brandon) used in this analogy are just examples and made up by me. Read more about personas and how we use them:The QlikView JourneyPersonas in UX DesignSo maybe you’ve heard of the Kano model which is a theory for user satisfaction..? If not, google it and you’ll find plenty of diagrams. Basically it says that there are some basic needs that our users expect us to meet, but it’s the “exciters” that will make our users truly delighted. The users may not have asked for such “exciters”, but once in their hands they love it. When Eileen, Linda and Brandon decide to have a cup of coffee, they probably have the same basic expectations: it should kind of taste like coffee and it should be hot. For Eileen, the taste might not be crucial, all she wants is caffeine and she wants it now. She is happy with instant coffee prepared in her own office. Linda has slightly higher demands; she drinks several cups a day by her desk. She is happy using the coffee machine near her cubicle and she’s perfectly fine with that as long as the machine works.Brandon wants his coffee to taste really good. He wants more than the regular coffee provided in the machines. He goes to the local coffee shop at lunch.Then one day, Eileen, Linda and Brandon find a new, top-of-the-line coffee maker in their office kitchenette. There is also a coffee grinder and beans. Someone has brewed them fresh coffee with such a fantastic taste that Eileen usually only enjoys at breakfast on Saturday mornings, that Linda did not even knew existed, and that Brandon now has access to in his office.For Eileen, it was such a delight that someone had prepared for her what she really wanted and with no extra time or hassle.For Linda, it was such a delight to discover what “real” coffee tastes like.And for Brandon, it was such a delight to be able to drink really good coffee together with his colleagues. Brandon now brings different kind of beans, so that everybody gets to try different roast flavors. After all, sharing is caring So how can this be translated into UX design? Well, first of all, whenever we're designing a QlikView app, a QlikView extension or a new generation of the QlikView platform, we need to accomplish a hygiene level that our users expect. Hygiene features refer to basic operations that the users must be able to do in order to complete their tasks, complying to standard interaction techniques, factors that facilitate learnability etc. If done well, the hygiene features won’t even be noticed because it just works. On the other hand, when poorly done, users will notice them because they cause frustration or dissatisfaction. But to offer our users a world class experience, we need to turn our innovation, passion and leading-edge technology into really useful, nifty solutions that will make our users go “wow”! In order to do this, we need to truly understand the driving forces and contexts of our users. So you could say that UX design is about being the users’ advocate, being someone who realizes that Eileen, Linda and Brandon needed a Moccamaster.
...View More
One of the strengths of QlikView is its search engine. With it, you can find pieces of information in a fraction of a second and select the found field values. The response is immediate, which is necessary for the user experience. Without it, you would easily get an empty result set without understanding why.
Search strings can be made in many different ways, and QlikView will respond differently depending on how the search string is defined. Normally you just enter a text, and QlikView will match this against the beginning of the words in the field values. If several strings are entered, QlikView will return the union of the matches of each of the strings.
But if you instead use a wildcard in your search string, the evaluation of the search string will be made in a different way: the entire search string with the wild card will be matched against the entire field value, sometimes yielding more matches, sometimes fewer.
If you want to create more complex search strings (and e.g. store them in actions or bookmarks) you can do this too. Just use (, |, & and double quotes to define the syntax.
In all the above cases, the search and the selection are made in one and the same field. But sometimes you want to make the selection in one field, but make the search in another. This can be done using the associated search, which is an indirect search method. Start with the field where you want to make the selection, enter the search string, and click on the small chevron to the right. You will then get a list of other fields containing this search string. By clicking the desired match, you will narrow down the number of matches in the primary list to show just the relevant values. You can then make your selection by hitting Enter.
Further, did you know that
In the user preferences and in the list box properties, you can define how a default search string should be created, but this does not affect how it is evaluated – only how it is created. Once created, you can add or remove wild cards as you please.
When you make a search and save the resulting selection in a bookmark, the bookmark will contain the search string and not the list of selected values. When the bookmark is applied, it will perform the search and select the found values. If data has changed, this may imply a different search result than before.
You can use the same search string in many places: In list boxes, in Set analysis, in the Advanced search dialog, in actions and in bookmarks.
Bottom line: The search string is a powerful tool that helps you find the values you want. Use it.
HIC
Further reading related to this topic:
Search - But what shall you find?
The Search String
The Expression Search
...View More