This Techspert Talks session addresses:
00:00 - Intro
01:22 - Multi-Node Architecture Overview
04:10 - Common Performance Bottlenecks
05:38 - Using iPerf to measure connectivity
09:58 - Performance Monitor Article
10:30 - Setting up Performance Monitor
12:17 - Using Relog to visualize Performance
13:33 - Quick look at Grafana
14:45 - Qlik Scalability Tools
15:23 - Setting up a new scenario
18:26 - Look at QSST Analyzer App
19:21 - Optimizing the Repository Service
21:38 - Adjusting the Page File
22:08 - The Sense Admin Playbook
23:10 - Optimizing PostgreSQL
24:29 - Log File Analyzer
27:06 - Summary
27:40 - Q&A: How to evaluate an application?
28:30 - Q&A: How to fix engine performance?
29:25 - Q&A: What about PostgreSQL 9.6 EOL?
30:07 - Q&A: Troubleshooting performance on Azure
31:22 - Q&A: Which nodes consume the most resources?
31:57 - Q&A: How to avoid working set breaches on engine nodes?
34:03 - Q&A: What do QRS log messages mean?
35:45 - Q&A: What about QlikView performance?
36:22 - Closing
Resources:
Qlik Help – Deployment examples
Using Windows Performance Monitor
PostgreSQL Fine Tuning starting point
Qlik Sense Shared Storage – Options and Requirements
Qlik Help – Performance and Scalability
Q&A:
Q: Recently I've been facing RAM overload on our Qlik Sense proxy servers, although there are 4 nodes, each with 16 CPUs and 256 GB. We have done app optimization, such as deleting duplicate apps, removing old data, and removing unused fields, but RAM status is still not good. What is the next step to fix the performance issue? Add more nodes?
A: It depends on what you mean by “RAM status still not good”. Qlik Data Analytics software will allocate and use memory within the limits established and does not release this memory unless the Low Memory Limit has been reached and the cache needs cleaning. If RAM consumption remains high but there are no other effects, your system is working as expected.
Q: Similar to other databases, do you think we need to perform fine-tuning and clean up bad records within PostgreSQL, e.g. once per year?
A: Periodic cleanup, especially in a rapidly changing environment, is certainly recommended. A good starting point: set your Deleted Entity Log table cleanup settings to appropriate values, and avoid clean-up tasks kicking in before the users' morning ramp-up.
Q: Does QlikView Server perform similarly to Qlik Sense?
A: It uses the same QIX Engine for data processing. There may be performance differences to the extent that QVW Documents and QVF Apps are completely different concepts.
Q: Is there a simple way (better than restarting QS services) to clean the cache? A cache at around 90% slows down QS.
A: It’s not quite as simple. Qlik Data Analytics software (and by extension, your users) benefits from keeping data cached as long as possible. This way, users consume pre-calculated results from memory instead of computing the same results over and over. Active cache clearing is detrimental to performance. High RAM usage is entirely normal, based on the Memory Limits defined in the QMC. You should not expect Qlik Sense (or QlikView) to manage memory like regular software. If work stops, this does not mean memory consumption will go down; we expect to receive and serve more requests, so we keep as much cached as possible. Long-winded, but I hope this sets better expectations when considering “bad performance” without the full technical context.
Q: When CPU hits 100%, how do we know what the culprit is, for example too many concurrent users loading apps/datasets or multiple apps or QVDs reloading? Can we see that anywhere?
A: We will provide links to the Log Analysis app I demoed during the webinar, this is a great place to start. Set Repository Performance logs to DEBUG for the QRS performance part, start analysing service resource usage trends and get to know your user patterns.
Q: Can there be repository connectivity issues with too many nodes?
A: You can only grow an environment so far before hitting physical limits to communication. As a best practice, with every new node added, QRS Connection Pools and DB connectivity should be reviewed and increased where necessary. The most usual problem here is that you have added more nodes than there are connections allowed to the DB or Repository Services. This will almost guarantee communication issues.
Q: Do the Qlik Scalability Tools measure browser rendering time as well, or do they just work at the API layer?
A: Excellent question, it only evaluates at the API call/response level. For results that include browser-side rendering, other tools are required (LoadRunner, complex to set up, expert help needed).
Transcript:
Hello everyone and welcome to the November edition of Techspert Talks. I’m Troy Raney and I’ll be your host for today's session. Today's presentation is Optimizing Performance for Qlik Sense Enterprise with Mario Petre. Mario why don't you tell us a little bit about yourself?
Hi everyone; good to be here with everybody once again. My name is Mario Petre. I’m a Principal Technical Engineer in the Signature Support Team. I’ve been with Qlik over six years now and since the beginning, I’ve focused on Qlik Sense Enterprise backend services, architecture and performance from the very inception of the product. So, there's a lot of historical knowledge that I want to share with you and hopefully it's an interesting springboard to talk about performance.
Great! Today we're going to be talking about how a Qlik Sense site looks from an architectural perspective; what are things that should be measured when talking about performance; what to monitor after going live; how to troubleshoot and we'll certainly highlight plenty of resources and where to find more details at the end of the session. So Mario, we're talking about performance for Qlik Sense Enterprise on Windows; but ultimately, it's software on a machine.
That's right.
So, first we need to understand what Qlik Sense services are and what type of resources they use. Can you show us an overview from what a multi-node deployment looks like?
Sure. We can take a look at how a large Enterprise environment should be set up.
And I see all the services have been split out onto different nodes. Would you run through the acronyms quickly for us?
Yep. On a consumer node this is where your users come into the Hub. They will come in via the Qlik Proxy Service and consume applications via the Qlik Engine Service, that ultimately connects to the central node and everything else via the Qlik Repository Service.
Okay.
The green box is your front-end services. This is what end users tap into to consume data, but what facilitates that in the background is always the Repository Service.
And what's the difference between the consumer nodes on the top and the bottom?
These two nodes have a Proxy Service that balances against their own engines as well as other engines; while the consumer nodes at the bottom are only there for crunching data.
Okay.
And then we can take a look at the backend side of things. Resources are used to the extent that you're doing reloads, you will have an engine there as well as the primary role for the central node, active and failover which is: the Repository Service to coordinate communication between all the rest of the services. You can also have a separate node for development work. And ultimately we also expect the size of an environment to have a dedicated storage solution and a dedicated central Repository Database host either locally managed or in one of the cloud providers like AWS RDS for example.
Between the front-end and back-end services where's the majority of resource consumption, and what resources do they consume?
Most of the resource allocation here is going to go to the Engine Service; and that will consume CPU and RAM to the extent that it's allocated to the machine. And that is done at the QMC level where you set your Working Set Limits. But in the case of the top nodes, the Proxy Service also has a compute cost as it is managing session connectivity between the end user's browser and the Engine Service on that particular server. And the Repository Service is constantly checking the authorization and permissions. So, ultimately front-end servers make use of both front-end and back-end resources. But you also need to think about connectivity. There is the data streaming from storage to the node where it will be consumed and then loading from that into memory. And these are three different groups of resources: you have compute; you have memory, and you have network connectivity. And all three have to be well suited for the task for this environment to work well.
And we're talking about speed and performance like, how fast is a fast network? How can we even measure that?
So, for any Enterprise environment, we would start at a 10 Gb network speed, and ultimately we expect a response time of 4 ms between any node and the storage back end.
Okay. So, what are some common bottlenecks and issues that might arise?
All right. So, let's take a look at some examples. The Repository Service failing to communicate with rim nodes, with local services. I would immediately try to verify that the Repository Service connection pool and network connectivity are stable and correct. Let's say apps load very very slow for the first time. This is where network speed really comes into play. Another example: the QMC or the Hub takes a very very long time to load. And for that, we would have to look into the communication between the Repository Service and the Database, because that's where we store all of this metadata that we will try to calculate your permissions based on.
And could that also be related to the rules that people have set up and the number of users accessing?
Absolutely. You can hurt user experience by writing complex rules.
What about lag in the app itself?
This is now being consumed by the Engine Service on the consumer node. So, I would immediately try to evaluate resource consumption on that node, primarily CPU. Another great example is high Page File usage. We prefer memory for working with applications. So, as soon as we try to cache and pull those results again from disk, performance will be suffering. And ultimately, the direct connectivity. How good and stable is the network between the end user's machine and the Qlik Sense infrastructure? The symptom will be on the end user side, but the root cause almost always (I mean 99.9% of the time) will be down to some effect in the environment.
So, to get an understanding of how well the machine works and establish that baseline, what can we use?
One simple way to measure this (CPU, RAM, disk, network) is this neat little tool called iPerf.
Okay. And what are we looking at here?
This is my central node.
Okay. And iPerf will measure what exactly?
How fast data transfer is between this central node and a client machine or another server.
And where can people find iPerf?
Great question. iPerf.fr
And it's a free utility, right?
Absolutely.
So, I see you've already got it downloaded there.
Right. You will have to download this package, both on the server and the client machine that you want to test between. We'll run this “As Admin.” We call out the command; we specify that we want it to start in “server mode.” This will be listening for connection attempts.
Okay.
We can define the port. I will use the default one. Those ports can be found in Qlik Help.
Okay.
The format for the output in megabyte; and the interval for refresh 5 seconds is perfectly fine. And then, we want as much output as possible.
Okay.
First, we need to run this. There we go. It started listening. Now, I’m going to switch to my client machine.
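The server-side options described above can be sketched as a small Python helper that assembles the command line. This is a minimal sketch, assuming the iperf3 build of the tool (the transcript does not say which version was used); the helper name `iperf_server_command` is hypothetical, and 5201 is the iperf3 default port.

```python
from typing import List


def iperf_server_command(port: int = 5201, interval_s: int = 5) -> List[str]:
    """Build an iperf3 server-mode command line as an argument list."""
    return [
        "iperf3",
        "--server",                     # listen for incoming test connections
        "--port", str(port),            # 5201 is the iperf3 default port
        "--format", "M",                # report throughput in MBytes/sec
        "--interval", str(interval_s),  # seconds between periodic reports
        "--verbose",                    # more detailed output
    ]


if __name__ == "__main__":
    # Print the command so it can be pasted into an elevated prompt.
    print(" ".join(iperf_server_command()))
```

The list form can be passed directly to `subprocess.run` on the server once iperf3 is on the PATH.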
So, iPerf is now listening on the server machine and you're moving over to the client machine to run iPerf from there?
Right. Now, we've opened a PowerShell window into iPerf on the client machine. Then we call the iPerf command. This time, we're going to tell it to launch in “Client Mode.” We need to specify an IP address for it to connect to.
And that's the IP address of the server machine?
Right. Again, the port; the format so that every output is exactly the same. And here, we want to update every second.
Okay.
And this is a super cool option: if we use the bytes flag, we can specify the size of the data payload. I’m going to go with a 1 Gb file (1024 Mb). You can also define parallel connections. I want 5 for now.
So, that's like 5 different users or parallel streams of activity of 1 Gb each between the server machine and this client machine?
Right. So, we actually want to measure how fast can we acquire data from the Qlik Sense server onto this client machine. We need to reverse the test. So, we can just run this now and see how fast it performs.
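The client-side run described above can be sketched the same way. This is a minimal sketch, assuming iperf3; the helper name `iperf_client_command` is hypothetical, and the IP address below is a documentation placeholder, not the address used in the demo.

```python
from typing import List


def iperf_client_command(
    server_ip: str,
    port: int = 5201,
    interval_s: int = 1,
    payload: str = "1024M",
    parallel: int = 5,
    reverse: bool = True,
) -> List[str]:
    """Build an iperf3 client-mode command line as an argument list."""
    cmd = [
        "iperf3",
        "--client", server_ip,          # connect to the listening server
        "--port", str(port),            # must match the server port
        "--format", "M",                # same output format as the server
        "--interval", str(interval_s),  # refresh every second
        "--bytes", payload,             # size of the data payload
        "--parallel", str(parallel),    # number of parallel streams
    ]
    if reverse:
        # Reverse mode: the server sends and the client receives.
        cmd.append("--reverse")
    return cmd


if __name__ == "__main__":
    # 192.0.2.10 is a placeholder address reserved for documentation.
    print(" ".join(iperf_client_command("192.0.2.10")))
```

Setting `reverse=False` gives the ordinary client-to-server direction, which is useful for comparing both directions against the same baseline.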
Okay. And did the server machine react the same way?
You can see that it produced output on the listening screen. This is where we started. And then it received and it's displaying its own statistics. And if you want to automate this, so that you have a spot check of throughput capacity between these servers, we need to use the log file option. And then we give it a path. So, I’m gonna say call this “iperf_serverside…” And launch it. And now, no output is produced.
Okay.
So, we can switch back to the client machine.
Okay. So, you're performing the exact same test again, just storing everything in a log file.
The test finished.
Okay. So, that can help you compare between what's being sent to what's being received, and see?
Absolutely. You can definitely have results presented in a way that is easy to compare across machines and across time. And initial results gave us a throughput per file of around 43.6, 46, thereabouts megabytes per second.
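To compare a figure reported in megabytes per second with a link speed quoted in gigabits per second, a quick conversion helps. This is a minimal sketch, assuming 1 MByte = 10^6 bytes for simplicity (iPerf's own unit definition may differ slightly); both function names are hypothetical.

```python
def mbytes_per_s_to_gbits_per_s(mbytes_per_s: float) -> float:
    """Convert MBytes/s to Gbit/s, taking 1 MByte as 10**6 bytes."""
    return mbytes_per_s * 8 / 1000


def transfer_seconds(size_mbytes: float, rate_mbytes_per_s: float) -> float:
    """Estimate how long a payload takes at a steady transfer rate."""
    return size_mbytes / rate_mbytes_per_s


if __name__ == "__main__":
    rate = 43.6  # per-stream figure quoted in the session, in MBytes/s
    print(f"{mbytes_per_s_to_gbits_per_s(rate):.4f} Gbit/s")
    print(f"{transfer_seconds(1024, rate):.1f} s for a 1024 MByte payload")
```

Under that assumption, 43.6 MBytes/s works out to roughly 0.35 Gbit/s per stream.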
So, what about for an end user who's experiencing issues? Can you use iPerf to test the connectivity from a user machine on a different network?
Yep. So, in the background we will have our server; it's running and waiting for connections. And let's run this connection now from the client machine. We will make sure that the IP address is correct; default port; the output format in megabytes; we want it refreshed every second; and we are transferring 1 Gb; and 5 parallel streams in reverse order. Meaning: we are copying from the server to the client machine. And let's run it.
Just seeing those numbers, they seem to be smaller than what we're seeing from the other machine.
Right. Indeed. I have some stuff in between to force it to talk a little slower. But this is one quick way to identify a spotty connection. This is where a baseline becomes gold; being able to demonstrate that your platform is experiencing a problem. And to quantify and to specify what that problem is going to reduce the time that you spend on outages and make you more effective as an admin.
Okay. That was network. How can admins monitor all the other performance aspects of a deployment? What tools are available and what metrics should they be measuring?
Right. That's a great question. The very basic is just Performance Monitor from Windows.
Okay.
The great thing about that is that we provide templates that also include metrics from our services.
Can you walk us through how to set up the Performance Monitor using one of those templates?
Sure thing. So, we're going to switch over first to the central node. So, the first thing that I want to do is create a folder where all of these logs will be stored.
Okay. So, that's a shared folder, good.
And this article is a great place to start. So, we can just download this attachment
So, now it's time to set up a Performance Monitor proper. We need to set up a new Data Collector Set.
Giving it a name.
And create from template. Browse for it, and finish.
Okay. So it’s got the template. That's our new one Qlik Sense Node Monitor, right?
Yep. You'll have multiple servers all writing to the same location. The first thing is to define the name of each individual collector; and you do that here. And you can also provide a subdirectory for these collectors, and I suggest having one per node name. I will call this Central Node.
Everything that comes from this node, yeah.
Correct. You can also select a schedule for when to start these. We have an article on how to make sure that Data Collectors are started when Windows starts. And then a stop condition.
Now, setting up monitors like this; could this actually impact performance negatively?
There is always an overhead to collecting and saving these metrics to a file. But the overhead is negligible.
Okay.
I am happy with how this is defined. Now, this static collector on one of the nodes is already set up. There is an option here that's called Data Manager. What's important here to define is to set a Minimum Free Disk. We could go with 10 Gb, for example; and you can also define a Resource Policy. The important bit is Minimum Free Disk. We want to Delete the Oldest (not the largest) in the Data Collector itself. We should change that directory and make sure that it points to our central location instead of locally; and we'll have to do this for every single node where we set this up.
Okay. So, that's that shared location?
Yep.
And you run the Data Collector there. And it creates a CSV file with all those performance counters. Cool.
So, here we have it now. If we just take a very quick look inside, we'll see a whole bunch of metrics. And if you want to visualize these really really quick, I can show you a quick tip that wasn't on the agenda but since we're here: on Windows, there is a built-in tool called Relog that is specifically designed for reformatting Performance Monitor counters. So, we can use Relog; we'll give it the name of this file; the format will be Binary; the output will be the same, but we'll rename it to BLG; and let's run it.
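The Relog conversion described above can be sketched as a helper that derives the output name from the input file. This is a minimal sketch; the helper name `relog_to_binary_command` and the example path are hypothetical, while `-f BIN` and `-o` are the Relog options for output format and output file.

```python
from pathlib import PureWindowsPath
from typing import List


def relog_to_binary_command(csv_path: str) -> List[str]:
    """Build a Relog command that rewrites a CSV counter log as a .blg file."""
    source = PureWindowsPath(csv_path)
    target = source.with_suffix(".blg")  # same name, binary log extension
    return ["relog", str(source), "-f", "BIN", "-o", str(target)]


if __name__ == "__main__":
    # Hypothetical location of a collected Performance Monitor log.
    print(" ".join(relog_to_binary_command(r"C:\PerfLogs\CentralNode.csv")))
```

`PureWindowsPath` is used so the path handling behaves the same even when the command is prepared on a non-Windows machine.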
And now it created a copy in Binary format. Cool thing about this Troy is that: you can just double click on it.
It's already formatted to be a little more readable. Wow! Check that out.
There we go. Another quick tip: since we're here, first thing to do is: select everything and Scale; just to make sure that you're not missing any of the metrics. And this is also a great way to illustrate which service counters and system counters we collect. As you can see, there's quite a few here.
Okay. So, that Performance Monitor is, it's set up; it's running; we can see how it looks; and that is going to run all the time or just when we manually trigger it?
You can definitely configure it to run all the time, and that would be my advice. Its value is really realized as a baseline.
Yeah. Exactly. That was pretty cool seeing how that worked, using all the built-in utilities. And that Relog formatting for the Performance Monitor was new to me. Are there any other tools you'd like to highlight?
Yeah. So, Performance Monitor is built-in. For larger Enterprises that may already be monitoring resources in a centralized way, there's no reason why you shouldn't expect to include the Sense resources into that live monitoring. And this could be done via different solutions out there. A few come to mind like: Grafana, Datadog, Butler SOS, for example from one of our own Qlik luminaries.
Can we take a quick look at Grafana? I’ve heard of that but never seen it.
Sure thing. This is my host monitor sheet. It's nowhere near built to a corporate standard, but you can see here I’m looking at resources for the physical host where these VMs are running as well as the domain controller, and the main server where we've been running our CPU tests. And the great part about this is I have historical data as far back I believe as 90 days.
So, this is a cool tool that lets you like take a look at the performance and zoom-in and find the processes that might be causing some peaks or anything you want to investigate?
Right. Exactly. At least come up with a narrow time frame for you to look into the other tools and again narrow down the window of your investigation.
Yeah, that could be really helpful. Now I wanted to move on to the Qlik Sense Scalability Tools. Are those available on Qlik community?
That's right. Let me show you where to find them. You can see that we support all current versions including some of the older ones. You will have to go through and download the package and the applications used for analysis afterwards. There is a link over here. So, once the package is downloaded, you will get an installer. And the other cool thing about Scalability Tools is that you can use it to pre-warm the cache on certain applications since Qlik Sense Enterprise doesn't support application pre-loading.
Oh, cool. So, you can throttle up applications into memory like in QlikView. Can we take a look at it?
Yes, absolutely. This is the first thing that you'll see. We'll have to create a new connection. So, I’ll open a simple one that I’ve defined here and we can take a look at what's required just to establish a quick connection to your Qlik Sense site.
Okay, but basically the scenario that you're setting up will simulate activity on a Qlik Sense site to test its performance?
Exactly. You'll need to define your server hostname. This can be any of your proxy nodes in the environment. The virtual proxy prefix. I’ve defined it as Header and authentication method is going to be WebSocket.
Okay.
And then, if we want to look at how virtual users are going to be injected into the system, scroll over here to the user section. Just for this simple test, I’ve set it up for User List where you can define a static list of users like so: User Directory and UserName.
Okay. So, it's going to be taking a look at those 2 users you already predefined and their activity?
Exactly. We need to test the connection to make sure that we can connect to the system. Connection Successful. And then we can proceed with the scenario. This is very simple but let me show you how I got this far. So, the very first thing that we should do is to Open an App.
So, you're dragging away items?
Yep. I’m removing actions from this list. Let's try to change the sheet. A very simple action. And now we have four sheets, and we'll go ahead and select one of them.
Okay, so far, we have Opening the App and immediately changing to a sheet?
Yep. That's right. This will trigger actions in sequence exactly how you define them. It will not take into consideration things like Think Time. I will just define a static wait of 15 seconds, and then you can make selections.
But this is an amazing tool for being able to kind of stress test your system.
It's very very useful and it also provides a huge amount of detail within the results that it produces. One other quick tip: while defining your scenario, use easy to read labels, so that you can identify these in the Results Application. Let's assume that the scenario is defined. We will go ahead and add one last action and that is: to close, to Disconnect the app. We'll call this “OpenApp.” We'll call this “SheetChange.” Make sure you Save. The connection we've tested; we've defined our list of users. First, let's run the scenario. There is one more step to define and that is: to configure an Executor that will use this scenario file to launch a workload against our system. Create a New Sequence.
This is just where all these settings you're defining here are saved?
Correct. This is simply a mapping between the execution job that you're defining and which script scenario should be used. We'll go ahead and grab that. Save it again; and now we can start it. And now in the background if we were to monitor the Qlik Sense environment, we would see some amount of load coming in. We see that we had some kind of issue here: empty ObjectID. Apparently I left something in the script editor; but yeah, you kind of get the idea.
So, all this performance information would then be loaded into an app that is part of the package downloaded from Qlik community. How does that look?
So, here you will see each individual result set, and you can look at multiple-exerciser runs in the single application. Unfortunately, we don't have more than one here to showcase that, but you would see multiple-colored lines. There are metrics for a little bit of everything: your session ramp, your throughput by minute, you can change these.
CPU, RAM. This is great.
Exactly. CPU and RAM. These are not connected. We don't have those logs, but you would have them for a setup run on your system. These come from Performance Monitor as well, so you could just use those logs provided that the right template is in place. We see Response Time Distribution by Action, and these are the ones that I’ve asked you to change and name so that they're easy to understand.
Once your deployment is large enough to need to be multi-node and the default settings are no longer the best ones for you, what needs to be adjusted with the Repository Service to keep it from choking or to improve its performance?
That's a great question Troy. So, the first thing that we should take a look at is how the Repository communicates with the backend Database and vice versa. The connection pool for the Repository is always based on core count on the machine. And the best rule of thumb that we have to date is to take your core count on that machine, multiply it by 5, and that will be the max connection pool for the Repository Service for that node.
Can you show us where that connection pool setting can be changed?
Yes. So, we will go ahead and take a look. Here we are on the central node of my environment. You'll have to find your Qlik installation folder. We'll navigate to the Repository folder, Util, QlikSenseUtil, and we'll have to launch this “As Admin.”
Okay.
We'll have to come to the Connection String Editor. Make sure that the path matches. We just have to click on Read so that we get the contents of these files. And the setting that we are about to change is this one.
Okay. So, the maximum number of connections that the Repository can make?
Yes. And this is (again) for each node going towards the Repository Database.
Okay.
Again, this should be a factor of CPU cores multiplied by 5. If 90 is higher than that result, leave 90 in place. Never decrease it.
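The rule of thumb above (cores multiplied by 5, never going below the value already in place) can be written as a one-line calculation. This is a minimal sketch; the function name `repository_max_pool_size` is hypothetical, and 90 is the existing value mentioned in the session.

```python
def repository_max_pool_size(cpu_cores: int, current_value: int = 90) -> int:
    """Return the larger of cores * 5 and the value already configured."""
    return max(current_value, cpu_cores * 5)


if __name__ == "__main__":
    for cores in (8, 16, 24, 32):
        print(cores, "cores ->", repository_max_pool_size(cores))
```

With 16 cores the result stays at 90 (since 16 × 5 = 80), while 24 cores gives 120.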
Okay, that's a good tip.
Right. I change this to 120. I have to Save. What I like to do here is: clear the screen and hit Read again; just to make sure that the changes have been persisted in the file.
Okay.
Once that's done, we can close this. We can restart the environment. We can get out of here.
So, there you adjusted the setting of how many connections this node can make to the QSR. Then assuming we do the same on all nodes, where do we adjust the total number of connections the Repository itself can receive?
That should be a sum of all of the connection strings from all of your nodes plus 110 extra for the central node. By default, here is where you can find that config file: Repository, PostgreSQL, and we'll have to open this one, PostgreSQL. Towards the end of the file…
Just going all the way to the bottom.
Here we have my Max Connections is 300.
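The sizing rule above (the sum of every node's connection pool plus 110 extra for the central node) can be sketched as follows. This is a minimal sketch; the function name `postgres_max_connections` and the node pool sizes in the example are hypothetical.

```python
from typing import Iterable


def postgres_max_connections(node_pool_sizes: Iterable[int]) -> int:
    """Sum the per-node Repository pool sizes and add 110 for the central node."""
    return sum(node_pool_sizes) + 110


if __name__ == "__main__":
    # Hypothetical site: one node raised to 120 and two nodes left at 90.
    print(postgres_max_connections([120, 90, 90]))
```

Re-running the calculation each time a node is added keeps the database limit in step with the per-node connection strings.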
Okay. One other setting you mentioned was the Page File and something to be considered. How would we make changes or adjust that setting?
Right. So, this is a Windows level setting that's found in Advanced System Settings; Advanced tab; Performance; and then again Advanced; and here we have Virtual Memory.
Okay.
We have to hit Change. We'll have to leave it at System Managed or understand exactly which values we are choosing and why. If you're not sure, the default should always be System Managed.
Now, I want to know what resources are available for Qlik Sense admins; specifically, what is the Admin Playbook?
It's a great starting place for understanding what duties and responsibilities one should be thinking about when administering a Qlik Sense site.
So, these are a bunch of tools built by Qlik to help analyze your deployment in different ways. I see weekly, monthly, quarterly, yearly, and a lot of different things are available there.
Yeah. So, we can take a look at Task Analysis, for example. The first time you run it, it's going to take about 20 minutes; thereafter about 10. The benefits: it shows you really in depth how to get to the data and then how to tweak the system to work better based on what you have.
Yeah, that's great.
Right? So, not only do we put the tools in your hands, but also how to build these tools, as you can see here. We have instructions on how to come up with these objects from scratch. An absolute must-read for every system admin out there.
Mario, we've talked about optimizing the Qlik Sense Repository Service, but not about Postgres? Do larger Enterprise level deployments affect its performance?
Sure. The thing about Postgres is again: we have to configure it by default for compatibility and not performance. So, it's another component that has to be targeted for optimization.
The detail there that anything over 1 Gb from Postgres might get paged - that sounds like it could certainly impact performance.
Right, because the buffer setting that we have by default is set to 1 Gb; and that means only 1 Gb of physical memory will be allocated to Postgres work. Now, we're talking about the large environment 500 to maybe 5,000 apps. We're talking 1000s of users with about 1000 of them peak concurrency per hour.
So, can we increase that Shared Buffer setting?
Absolutely. And in fact, I want to direct you to a really good article on performance optimization for PostgreSQL. And when we talk about fine-tuning, this article is where I’d like to get started. We talk about certain important factors like the Shared Buffers. So, this is what we define to 1 Gb by default. Their recommendation is to start with 1/4 of physical memory in your system. 1 Gb is definitely not one quarter of the machines out there. So, it needs tweaking.
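The one-quarter starting point mentioned above can be turned into a concrete value for the database host. This is a minimal sketch; both function names and the 64 GB example host are hypothetical, and the result is only a starting point to be validated against the article's other recommendations.

```python
def suggested_shared_buffers_mb(physical_memory_gb: float) -> int:
    """Return one quarter of physical memory, in MB, as a starting point."""
    return int(physical_memory_gb * 1024 / 4)


def shared_buffers_line(physical_memory_gb: float) -> str:
    """Render the value in the form used by the PostgreSQL config file."""
    return f"shared_buffers = {suggested_shared_buffers_mb(physical_memory_gb)}MB"


if __name__ == "__main__":
    # Hypothetical database host with 64 GB of RAM.
    print(shared_buffers_line(64))
```

For a 64 GB host this yields `shared_buffers = 16384MB`, compared with the 1 Gb default discussed in the session.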
And again these are settings to be changed on the machine that's hosting the Repository Database, right?
That's correct. That's correct.
Now, is there an app that you're aware of that would be good to kind of look at all these logs and analyze what's going on with the performance?
Absolutely. This is an application that was developed to better understand all of the transactions happening in a particular environment. It reads the log files collected with the Log Collector either via the tool or the QMC itself.
Okay.
It's not built for active monitoring, but rather to enhance troubleshooting.
Sure. So, basically it's good for looking at a short period of time to help troubleshooting?
Right. The Repository itself communicates over APIs between all the nodes and keeps track of all of the activities in the system; and these translate to API calls. If we want to focus on Repository API calls, we can start by looking at transactions.
Okay.
So, this will give us detail about cost. For example, per REST call or API call, we can see which endpoints take the most, duration per user, and this gives you an opportunity to start at a very high level and slowly drill in both in message types and timeframe. Another sheet is the Threads Endpoints and Users; and here you have performance information about how many worker-threads the Repository Service is able to start, what is the Repository CPU consumption, so you can easily identify one. For example, here just by this count, we can see that the preview privileges call for objects is called…
Yeah, a lot.
Over half a million times, right? And represents 73% of the CPU compute cost.
Wow, nice insights.
And then if we look here at the bottom, we can start evaluating time-based patterns and select specific time frames and go into greater detail.
So, I’m assuming this can also show resource consumption as well?
Right. CPU, memory in gigabytes and memory in percent. One neat trick is: to go to the QMC, look at how you've defined your Working Set Limits, and then pre-define reference lines in this chart. So, that it's easier to visualize when those thresholds are close to being reached or breached. And you do that by the add-ons reference lines, and you can define them like this.
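To place reference lines on a chart that shows memory in gigabytes, the percentage limits from the QMC need to be converted into absolute values. This is a minimal sketch; the function name `working_set_thresholds_gb` is hypothetical, and the 256 GB, 70% and 90% figures are illustrative only, so substitute the values configured for your own engine nodes.

```python
from typing import Tuple


def working_set_thresholds_gb(
    total_ram_gb: float, low_pct: float, high_pct: float
) -> Tuple[float, float]:
    """Convert Working Set Limit percentages into GB reference-line values."""
    if not 0 <= low_pct <= high_pct <= 100:
        raise ValueError("expected 0 <= low_pct <= high_pct <= 100")
    return (total_ram_gb * low_pct / 100, total_ram_gb * high_pct / 100)


if __name__ == "__main__":
    low, high = working_set_thresholds_gb(256, 70, 90)
    print(f"low reference line:  {low:.1f} GB")
    print(f"high reference line: {high:.1f} GB")
```

The two values can then be entered as the reference lines so that approaching or breaching a limit is visible at a glance.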
That's just to sort of set that to match what's in the QMC?
Exactly.
Makes a powerful visualization. So, you can really map it.
Absolutely. And you can always drill down into specific points in time. We can go and check the Log Details Engine Focus sheet; this will allow us to browse over time, select things like errors and warnings alone, and then we will have all of the messages that are coming from the log files and what their sources are.
Yeah. That's great to have it all kind of collected here in one app, that's great.
Indeed.
To summarize things, we've talked about how, to understand system performance, a baseline needs to be established. That involves setting up some monitoring. There are lots of options and tools available to do that; and it's really about understanding how the system performs so that measurement and comparisons are possible if things don't perform as expected.
And to begin to optimize as well.
Okay, great. Well now, it's time for Q&A. Please submit your questions through the Q&A panel on the left side of your On24 console. Mario, which question would you like to address first?
We have some great questions already. So, let's see - first one is: how can we evaluate our existing Qlik Sense applications?
This is not something that I’ve covered today, but it's a great question. We have an application on community called App Metadata Analyzer. You can import this into your system and use it to understand the memory footprint of applications and objects within those applications and how they scale inside your system. It will very quickly illustrate if you are shipping applications with extremely large data files (for example) that are almost never used. You can use that as a baseline both for optimizing local applications and in your efforts to migrate to SaaS. If you feel like you don't want to bother with all of this Performance Monitoring and optimization, you can always choose to use our services and we'll take care of that for you.
Okay, next question.
So, the next question: worker schedulers errors and engine performance. How to fix?
I think I would definitely point you back to this Log Analysis application. Load that time frame where you think something bad happened, and see what kind of insights you can get by playing with the data, by exploring the data. And then narrow that search down if you find a specific pattern that seems like the product is misbehaving. Talk to Qlik support. We'll evaluate that with you and determine whether this is a defect or not or if it's just a quirk of how your system is set up. But that Sense Log Analysis app is a great place to start. And going back to the sheet that I showed: Repository and Engine metrics are all collected there. And these come from the performance logs that we already produce from Qlik Sense. You don't need to load any additional performance counters to get those details.
Okay.
All right. So, there is a question here about Postgres 9.6 and the fact that it's soon coming to end of life. And I think this is a great moment to talk about this. Qlik Sense client-managed or Qlik Sense Enterprise for Windows supports Postgres 12.5 for new installations since the May release. If you have an existing installation, 9.6 will continue to be used; but there is an article on community on how to in-place upgrade that to 12.5 as a standalone component. So, you don't have to continue using 9.6 if your IT policy is complaining about the fact that it's soon coming to the end of life. As we say, we are aware of this fact; and in fact, we are shipping a new version as of the May 2021 release.
Oh, great.
So, here's an interesting question. If we have Qlik Sense in Azure on a virtual machine, why is the performance so sluggish? How do you fine-tune it? I guess first we need to understand what you mean by sluggish. But the first thing that I want to point to is: different instance types. So, virtual machines in virtual private cloud providers are optimized for different workloads. And the same is true for AWS, Azure and Google Cloud Platform. You will have virtual machines that are optimized for storage; ones that are optimized for compute tasks or application analytics; some that are optimized for memory. Make sure that you've chosen the right instance type and the right level of provisioned IOPS for this application. If you feel that your performance is sluggish, start increasing those resources. Go one tier up and reevaluate until you find an instance type that works for you. If you wish to have these results (let's say beforehand), you will have to consider using the Scalability Tools together with some of your applications against different instance types in Azure to determine which ones work best.
Just to kind of follow up on that question, if we're looking at that multi-node example from Qlik help, what nodes would you consider would require more resources?
Worker nodes in general. And those would be front and back-end.
So, a worker node is something with an engine, right?
Exactly. Something with an engine. It can either be front-facing together with a proxy to serve content, or back-end together with a scheduler service to perform reload tasks. These will consume all the resources available on a given machine.
Okay.
And this is how the Qlik Sense engine is developed to work. And these resources are almost never released unless there is a reason for it, because us keeping those results cached is what makes the product fast.
Okay.
Oh, here's a great one about avoiding working set breaches on engine nodes. Question says: do you have any tips for avoiding the max memory threshold from the QIX engine? We didn't really cover this aspect, but as you know, the engine allows you to configure both a low and a high memory limit. To understand how these work, I want to point you back to that QIX engine white paper. The system will perform certain actions when these thresholds are reached. The first prompt that I have for you in this situation is: understand if these limits are far away from your physical memory limit. By default, Qlik Sense (I believe) uses 70 / 90 as the low and high working sets on a machine. With a lot of RAM, let's say 256 GB or half a terabyte, if you leave that low working set limit at 70 percent, that means that by default 30 percent of your physical RAM will not be used by Qlik Sense. So, always keep in mind that these percentages are based on the physical amount of RAM available on the machine, and as soon as you deploy large machines (large: I’m talking 128 GB and up) you have to redefine these parameters. Raise them so that you utilize almost all of the resources available on the machine, and you should be able to visualize that very easily in the Log Analysis App by going to the Engine Load sheet and inserting those reference lines based on where your current working sets are. Of course, the only way really to avoid a working set limit issue is to make sure that you have enough resources and that the system is configured to utilize those resources. That means raising the limits and allowing the product to use as much RAM as it can without, of course, interfering with Windows operations (which is why you should never set these to something like 98 or 99). Windows needs RAM to operate by itself, and if we let Qlik Sense take all of it, it will break things.
If you've done that and you're still having performance issues, that means you need more resources.
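As a rough illustration of the arithmetic in that answer, the following sketch converts percentage working set limits into gigabytes for a few machine sizes. The function name and the 70 / 90 defaults are taken from the discussion above as assumptions; check the actual limits configured in the QMC before using the numbers as reference lines.

```python
def working_set_limits_gb(physical_ram_gb, low_pct=70, high_pct=90):
    """Translate percentage working set limits into gigabytes.

    The 70 / 90 defaults are the values quoted in the session, not
    values read from a real deployment.
    """
    low_gb = physical_ram_gb * low_pct / 100
    high_gb = physical_ram_gb * high_pct / 100
    return {
        "low_gb": low_gb,
        "high_gb": high_gb,
        # RAM that sits above each threshold on this machine
        "above_low_gb": physical_ram_gb - low_gb,
        "above_high_gb": physical_ram_gb - high_gb,
    }


for ram in (128, 256, 512):
    limits = working_set_limits_gb(ram)
    print(
        f"{ram} GB RAM: low limit {limits['low_gb']:.1f} GB, "
        f"high limit {limits['high_gb']:.1f} GB, "
        f"{limits['above_low_gb']:.1f} GB above the low limit"
    )
```

The gigabyte values for the low and high limits are the kind of numbers that could be entered as reference lines in the memory chart mentioned earlier.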
Yeah. It makes sense.
Oh, so here is another interesting question about understanding what certain Qlik Repository Service (QRS) log messages say. There is a question here that says: we try to meet the recommendation that network and persistence latency should be less than 4 ms, but consistently in our logs we are seeing the QRS security management retrieved privileges in so many milliseconds. Could this be a Repository Service issue or where would you suggest we investigate first? This is an info level message that you are reporting. And it's simply telling you how long it took for the Repository Service to compute the result for that request. That doesn't mean that this is how long it took to talk to the Database and back, or how long it took for the request to reach from client to the server; only how long it took for the Repository Service to look up the metadata, look up the security rules, and then return a result based on that. And I would say this coming back in 384 milliseconds is rather quick. It depends on how you've defined these security rules. If these security rules are super simple and you are still getting slow responses, we would definitely have to look at resource consumption. But if you want to know how these calls affect resource consumption on the Repository and Postgres side, go back to that Log Analysis App. Raise your Repository performance logs in the QMC to Debug level so that you get all of the performance information of how long each call took to execute. And try to establish some patterns. See if you have calls that take longer to execute than others, and where those are coming from: any specific apps, any specific users? All of these answers come from drilling down into the data via that app that I demoed.
Okay Mario, we have time for one last question.
Right. And I think this is an excellent one to end on. We talked a whole bunch here about Qlik Sense, but all of this also applies to QlikView environments. We are always looking at taking a step back and considering all of the resources that are playing in the ecosystem, not just the product itself. And the question asks: is QlikView Server similar to Qlik Sense in how it handles resources? The answer is: yes. The engine is exactly the same in both products. If you read that white paper, you will understand how it works in both QlikView and Qlik Sense. And the things that you should do to prepare for performance and optimization are exactly the same in both products. Excellent question.
Great. Well, thank you very much Mario!
Oh, it's been my pleasure Troy. That was it for me today. Thank you all for participating. Thank you all for showing up. Thank you Troy for helping me through this very very complicated topic. It's been a blast as always. And to our customers and partners, looking forward to seeing your questions and deeper dives into logs and performance on community.
Okay, great! Thank you everyone! We hope you enjoyed this session. Thank you to Mario for presenting. We appreciate getting experts like Mario to share with us. Here's our legal disclaimer and thank you once again. Have a great rest of your day.
Qlik Cloud is a modern analytics and data platform built on the same software engine as QlikView and Qlik Sense Client-Managed and adds significant value to empower everyone in an organization to make better decisions daily. Qlik Cloud allows you to use one common platform for all users – executives, decision-makers, and analysts.
Migrating to Qlik Cloud can help your organization:
This site provides you the tools to monitor, manage, and execute a migration from Client-Managed Qlik Sense to Qlik Cloud.
No two client-managed Qlik Sense Enterprise deployments are the same. And no two migrations will be the same. The processes, procedures, and instructions in this section shouldn’t be considered a cookbook. Rather, they’re meant to guide you.
The Qlik Cloud Migration Center provides a general approach to migration along with sequencing, strategy, and best practice recommendations. It also includes tools such as a Qlik Sense app, scripts, and worksheets to aid in planning elements of the migration.
If your organization has a complex deployment with custom tooling, or sophisticated or complicated data integration pipelines, consider contacting your Qlik Customer Support representative.
This site provides comparisons of QlikView and Qlik Cloud, as well as best practices on how to move content, including information about migration assessments and QlikView document conversions.
Move the data of two tables (archive and non-archive) to the same S3 bucket and subfolder. The assumptions are that both tables share the same DB schema and will not contain duplicate primary keys.
Option 1:
You will have to create a custom script to move or copy the files over to the same subfolder (you may need to engage developers). Once the script is ready, you can follow the instructions to configure post-upload processing for the S3 endpoint in Replicate.
Replicate will execute the command once the replication is complete. This way the files will be copied or moved to the same folder.
Option 2:
If the source has archive and non-archive tables and both tables share the same DB schema, then you could use global rules and rename one table as the other. This will allow Replicate to merge both tables.
8. Enter the Schema name and Table name (if you want to merge table A with B, then enter the table A schema name and table name), then select Next
9. Select "Rename table to" and enter the destination table name (if you want to merge table A with B, then enter the B table name), then select Next
10. Enter Name and click finish
11. Start your task
Environment
Qlik Replicate 2021.5 and above
Replicating data from Oracle SQL views (with a few fields explicitly defaulted to NULL) into Snowflake fails to create the table in Snowflake.
Environment
Resolution
The workaround below converts all string (0) columns to string (10), or to the length you specify. It is important to specify the scope under advanced options so that only string (0) is converted to string (10).
Cause:
NULL columns are not supported, and Oracle SQL views with fields explicitly defaulted to NULL create fields with data type string (0).
The information in this article is provided as-is and to be used at own discretion. Depending on tool(s) used, customization(s), and/or other factors ongoing support on the solution below may not be provided by Qlik Support.
A common use of filter expressions is to check a field in the record for a specific value and then either allow or not allow the delete transaction to replicate to the target.
The task was set up to filter deletes from the source with a transformation expression that checks a field for a value of 1:
CASE WHEN $AR_H_STREAM_POSITION != ''
THEN
CASE WHEN
$CompCode=1
THEN 1
END
ELSE 1
END
Warning message in the task logs:
"Table "Schema.Table" (Subtask 0 thread 1) is suspended. Cannot compute expression, not all column values are in the data record."
As we were discussing this issue, we realized that for a delete statement from the source, the only fields in the journal were before-image fields. There were no after-image fields, which are the ones that carry the field name "$CompCode".
When we changed the filter expression to use the before-image field name, "$BI_CompCode", it worked, because the before image was included in the journal entry.
NOTE: There can be other reasons why a field that is referenced in an expression is not in the journal or transaction log from the source. If you get an error message like this one, you can debug it by turning on the __CT tables to see what they are receiving from the source.
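The behavior described above can be pictured with a small simulation. This is only an illustrative sketch, not how Qlik Replicate evaluates expressions internally: the record layout, the `evaluate_filter` helper, and the `BI_OrderId` field are hypothetical.

```python
def evaluate_filter(record, field):
    """Return True if the record passes the filter (field equals 1).

    Raises KeyError when the field is absent, which stands in for the
    'not all column values are in the data record' warning.
    """
    return record[field] == 1


# Hypothetical delete record: only before-image values are present.
delete_record = {"BI_CompCode": 1, "BI_OrderId": 42}

try:
    evaluate_filter(delete_record, "CompCode")
except KeyError as missing:
    print(f"Cannot compute expression, missing column {missing}")

# Referencing the before-image name works because that value exists.
print(evaluate_filter(delete_record, "BI_CompCode"))
```

The point of the sketch is only that an expression can be evaluated when every column it references is present in the record for that operation type.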
All on-premise Qlik Products can be downloaded from Qlik's Product Download Site.
To access the Download Site, you need an active QlikID. You will be able to see all products your account is eligible for.
If you encounter issues with the download site, start a chat with us and we will be able to help you right away.
Troubleshooting the error: Error adding license LEF does not give server permissions when applying a Qlik GeoAnalytics license.
The LEF must have the property GEO_SERVER;YES;;
Qlik has local training offices around the world. Please contact a local representative for Training and Certification related issues.
You can find Regional contact details at Training Contacts.
Please review the Frequently Asked Questions prior to reaching out.
Qlik Support communicates information on new releases and Support related activities (Webinars and Q&As) on the Qlik Support Updates blog.
To subscribe to the blog:
This will alert you for activities such as:
This guide explains what to do when receiving a training card and how to proceed to book a course in order to redeem it.
You are a prospective customer and are looking to purchase a Qlik Cloud subscription.
You are a prospective customer, having never purchased Qlik licenses before and are interested in learning more about Qlik products and making a purchase:
You are an existing Customer, having previously purchased a license directly from Qlik and wish to purchase additional licenses or CALs:
You are an existing Customer and purchased your licenses through a third party (not directly from Qlik) and wish to purchase additional licenses or CALs:
By default, the task history in QlikView is kept for 30 days. Sometimes customers want to extend that further. If so, open this Knowledge article: QlikView Management Service Publisher Related Configuration Settings for Best Practices and potential performance improvements
In particular, we will need to add this setting: NbrOfDaysToKeepTaskExecutionHistoryItems
This setting determines the number of days to retain the Task Execution History items within the History files in the Distribution Service Application Data folder. Note: in large environments where there are a large number of tasks being run daily, it may be necessary to reduce this value to a few days. Files may be archived out to another folder prior to deletion if the history is desired. The setting should be changed on all clustered nodes in the environment.
This setting was introduced in the 12.10 and later tracks and is hidden, which means it must be manually added to the config file in the ****** Program Settings ****** section.
The default value is "30" days.
Steps:
QlikView 12.20 and later
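As a rough sketch of what adding the hidden NbrOfDaysToKeepTaskExecutionHistoryItems setting could look like, the snippet below edits a sample XML string. It assumes the config file uses standard .NET appSettings entries of the form add key/value; the sample document and the ExampleExistingSetting key are hypothetical, and the real file should be backed up and edited according to the linked article.

```python
import xml.etree.ElementTree as ET

# Hypothetical, minimal stand-in for the service config file; the real
# file contains many more settings.
SAMPLE_CONFIG = """<configuration>
  <appSettings>
    <add key="ExampleExistingSetting" value="true" />
  </appSettings>
</configuration>"""

KEY = "NbrOfDaysToKeepTaskExecutionHistoryItems"


def set_history_days(config_xml, days):
    """Add or update the history retention key and return the new XML text."""
    root = ET.fromstring(config_xml)
    settings = root.find("appSettings")
    for entry in settings.findall("add"):
        if entry.get("key") == KEY:
            # Key already present: only change its value.
            entry.set("value", str(days))
            break
    else:
        # Key missing: append a new entry (assumed appSettings layout).
        ET.SubElement(settings, "add", {"key": KEY, "value": str(days)})
    return ET.tostring(root, encoding="unicode")


updated = set_history_days(SAMPLE_CONFIG, 90)
print(updated)
```

Running the helper a second time updates the existing entry instead of adding a duplicate, which matters if the change is scripted across all clustered nodes.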
To allow the option to execute parallel loads when using the Qlik SAP Extractor Connector, an additional service must be present and properly configured. The service is called Qlik SAP Network Server service.
Before proceeding, note that only one service can run per SAP environment, to avoid a mismatch. For example, if a customer has separate Development, Test and Production environments, it is recommended to have one service per environment. However, the services cannot be installed on the same server; each one needs a different IP address.
In the event of upgrading the connectors, the service has to be stopped and the file SrvService.exe should be replaced with the new version before starting the service again.
Qlik Replicate fails to connect to Azure Synapse Environments.
Example error:
00004624: 2022-05-13T14:22:15:620371 [SERVER ]T: Exit ODBC Provider supported data types (ar_odbc_conn.c:1236)
00004624: 2022-05-13T14:22:15:620371 [SERVER ]V: Construct statement execute internal: 'select count(*) from sys.symmetric_keys where name like '%DatabaseMasterKey%';' (ar_odbc_stmt.c:3962)
00004624: 2022-05-13T14:22:15:667304 [SERVER ]V: Execute: 'select count(*) from sys.symmetric_keys where name like '%DatabaseMasterKey%';' (ar_odbc_stmt.c:2707)
00004624: 2022-05-13T14:22:15:812588 [SERVER ]T: Master key was not found , please create master key [1022503] (cloud_imp.c:3777)
Verify with your database administrator if the Qlik Replicate user has all the required permissions.
Example queries that can be run from SSMS (using the Qlik Replicate user):
select count(*) from sys.symmetric_keys where name like '%DatabaseMasterKey%';
In comparison run the below query with any admin user on the database:
select * from sys.symmetric_keys where name like '%DatabaseMasterKey%';
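To locate the relevant entry in a longer log, a small scan like the following may help. It is a sketch under assumptions: the line layout is inferred only from the example error above, and the names `LOG_ENTRY` and `find_master_key_errors` are made up for this illustration.

```python
import re

# Assumed layout, based only on the example lines above:
# "<thread id>: <timestamp> [<module>]<level letter>: <message>"
LOG_ENTRY = re.compile(
    r"^(?P<thread>\d+): (?P<timestamp>\S+) "
    r"\[(?P<module>[^\]]+)\](?P<level>\w):\s+(?P<message>.*)$"
)


def find_master_key_errors(lines):
    """Return parsed entries whose message reports a missing master key."""
    hits = []
    for line in lines:
        match = LOG_ENTRY.match(line.strip())
        if match and "Master key was not found" in match.group("message"):
            entry = match.groupdict()
            entry["module"] = entry["module"].strip()
            hits.append(entry)
    return hits


sample = [
    "00004624: 2022-05-13T14:22:15:667304 [SERVER ]V: Execute: 'select count(*) "
    "from sys.symmetric_keys where name like '%DatabaseMasterKey%';' (ar_odbc_stmt.c:2707)",
    "00004624: 2022-05-13T14:22:15:812588 [SERVER ]T: Master key was not found , "
    "please create master key [1022503] (cloud_imp.c:3777)",
]

for entry in find_master_key_errors(sample):
    print(entry["timestamp"], entry["message"])
```

A hit only shows that the Replicate user saw no master key; comparing the two queries above under the Replicate user and an admin user is still needed to tell a missing key from missing permissions.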
This article explains how SQL Server transaction log (T-log) cleanup works when Microsoft replication/publication is enabled on the database.
When a Qlik Replicate task runs for the first time to capture changes (CDC), Qlik Replicate creates a publication on the database with the required articles. As part of this publication, a Log Reader Agent job is also created, and this job runs continuously to mark replicated transactions on the database.
Apart from the Replicate process, there will be a transaction log backup job that runs every 15 or 30 minutes, depending on the source team's policy. As part of this log backup job, all transactions up to that point in time are backed up, and all replicated and committed transactions are truncated from the T-log.
Assume a scheduled T-log backup job is going to run at 10 am and the Replicate task is reading the transaction log with 5 minutes of latency. There is a high possibility that the backup job will remove transactions from the transaction log that Qlik Replicate has not yet read. In this scenario, the Qlik Replicate task will fail with a missing LSN error.
To prevent this kind of issue, Qlik Replicate implements an option to hold the T-log for a few minutes without truncation, based on the setting below:
Qlik Replicate creates an internal table called attrep_truncation_safeguard on the source database and keeps two update queries open without commit (called Latch Lock A and B; two update queries for each Qlik Replicate task running on the database). This happens only when you enable the Start transactions in the database setting on the source SQL Server endpoint. Qlik Replicate updates the time on these queries every 5 minutes by default, and the interval can be controlled with the option "Apply TLOG truncation prevention policy every (seconds):".
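The 10 am scenario can be walked through with a deliberately simplified model. This sketch is an assumption-laden illustration, not a description of SQL Server or Qlik Replicate internals: it treats the safeguard as a simple on/off switch, and the function name is hypothetical.

```python
from datetime import datetime, timedelta


def unread_changes_at_risk(backup_time, last_read_time, safeguard_enabled):
    """Return True if a log backup could truncate changes not yet read.

    Simplified model: with the safeguard on, an open transaction is
    assumed to hold back truncation; without it, anything committed
    before the backup may be truncated.
    """
    if safeguard_enabled:
        return False
    return last_read_time < backup_time


backup_at = datetime(2022, 5, 13, 10, 0)
# The task reads the log with 5 minutes of latency.
reader_position = backup_at - timedelta(minutes=5)

print(unread_changes_at_risk(backup_at, reader_position, safeguard_enabled=False))
print(unread_changes_at_risk(backup_at, reader_position, safeguard_enabled=True))
```

In the first call the reader is behind the backup point and nothing holds the log, which corresponds to the missing LSN case; in the second call the open transactions are assumed to keep those log records available.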
Here are the screenshots to explain how to check these open transactions on the database.
The Upgrade of Qlik Sense Enterprise for Windows fails on the Qlik Dispatcher Service with:
ERROR: schema "MobilityRegistrarService" already exists
The solution to this issue requires modifications to the database. Prior to applying the solution, take a backup. See Backup and Restore.
To resolve this, we will rename the schema 'qlik_mobility_registrar_service' (e.g. to qlik_mobility_registrar_service_old) prior to performing the installation/upgrade.
This will allow for the creation of the schema. In multi-node environments, it is suggested to rename the schema prior to each rim node installation/upgrade.
To rename the schema, perform the following steps:
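The article's own steps are not reproduced in this text. As a hedged sketch, a rename in PostgreSQL is normally done with an ALTER SCHEMA ... RENAME TO statement, and the helper below only builds that statement as a string. The helper name is hypothetical, and which database and tool to run the statement in should follow the official steps, after taking a backup.

```python
def rename_schema_sql(old_name, new_name):
    """Build a PostgreSQL ALTER SCHEMA ... RENAME TO statement."""

    def quote(identifier):
        # Double any embedded quotes, then wrap in double quotes.
        return '"' + identifier.replace('"', '""') + '"'

    return f"ALTER SCHEMA {quote(old_name)} RENAME TO {quote(new_name)};"


print(
    rename_schema_sql(
        "qlik_mobility_registrar_service",
        "qlik_mobility_registrar_service_old",
    )
)
```

Quoting the identifiers keeps the statement valid even if a schema name contains upper-case letters or other characters that PostgreSQL would otherwise fold or reject.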
The upgrade from November 2019 to February 2020 failed on the dispatcher service with the following error on the installation log:
CAQuietExec: MobilityRegistrarService configuration started.
CAQuietExec: WARNING: Skiping the database initialization. No superuser or password
CAQuietExec: specified.
CAQuietExec: Creating schema 'qlik_mobility_registrar_service'.
CAQuietExec: True
CAQuietExec: Error executing database command "CREATE SCHEMA
CAQuietExec: "qlik_mobility_registrar_service" AUTHORIZATION "qliksenserepository"": ERROR:
CAQuietExec: schema "qlik_mobility_registrar_service" already exists
CAQuietExec: At C:\Program Files\Qlik\Sense\MobilityRegistrarService\install\install-utils\P
CAQuietExec: ostgres.ps1:113 char:11
CAQuietExec: + throw "Error executing database command ...
CAQuietExec: + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
CAQuietExec: + CategoryInfo : OperationStopped: (:) , Exception
CAQuietExec: + FullyQualifiedErrorId : Error executing database command "CREATE SCHEMA
CAQuietExec: "qlik_mobility_registrar_service" AUTHORIZATION "qliksenserepository"": ER
CAQuietExec: ROR: schema "qlik_mobility_registrar_service" already exists
CAQuietExec:
CAQuietExec: Error 0x80070001: Command line returned an error.
CAQuietExec: Error 0x80070001: QuietExec Failed
CAQuietExec: Error 0x80070001: Failed in ExecCommon method
CustomAction CA_SetupQMR returned actual error code 1603 (note this may not be 100% accurate if translation happened inside sandbox)
Qlik Compose 2021.5
In the Qlik Compose May 2021 release, when a landing table had a foreign key, discovering the table from a Snowflake source would result in the following error:
System.ArgumentOutOfRangeException: Specified argument was out of the range of valid values.
Information provided on this defect is given as is at the time of documenting. For up-to-date information, please review the most recent Release Notes, or contact support with ID RECOB-3717 for reference.
2021.5-sp04 (2021.5.0.155) and higher.
Product Defect ID: RECOB-3717
The below steps need to be followed when a Qlik Visibility repository database is moving to another server.
Update the $VISIBILITY_HOME/server/config/server.properties file for the repository sections to reflect the new server information – see below for the parameter that you need to adjust.
For Oracle, verify that the TNS entry is updated, that you can connect from the Process Server to the Visibility Repository, and that ORACLE_HOME has been set for the Visibility OS account.
For DB2, catalog the new Visibility Repository database on the Process Server, verify that you can connect from the Process Server to the Visibility Repository, and verify that INSTHOME and DB2INSTANCE have been set for the Visibility OS account.
Server.Repository.Database=
Verify that ORACLE_HOME, INSTHOME and DB2INSTANCE are correct on the Visibility process server. If they are changing, make sure they are updated in the following files:
.profile / .bash_profile,
$VISIBILITY_HOME/env_server
$VISIBILITY_HOME/server/bin/analyzer_daemon
$VISIBILITY_HOME/server/bin/collector_daemon
$VISIBILITY_HOME/server/bin/populator_daemon
$VISIBILITY_HOME/server/bin/purger_daemon
Restart all four daemons to pick up the new TNS entry/INSTHOME/DB2INSTANCE
cd $VISIBILITY_HOME/server/bin
./analyzer_daemon restart
./collector_daemon restart
./populator_daemon restart
./purger_daemon restart
Update the Tomcat config (visibility.config) file to point to the new repository, restart Tomcat, and then verify that you can connect to the Visibility URL
repository.name=NEW_TNS(INSTHOME)_HERE
Qlik Compose 2021.5.0.176
The following error would occur when generating adjust script as part of data warehouse validation.
[Metadata ] [ERROR] AdjustDWH Error: Index was out of range. Must be non-negative and less than the size of the collection.
Parameter name: index
[Metadata ] [ERROR] AdjustDWH Error: at System.ThrowHelper.ThrowArgumentOutOfRangeException(ExceptionArgument argument, ExceptionResource resource)
at System.Collections.Generic.List`1.get_Item(Int32 index)
Information provided on this defect is given as is at the time of documenting. For up-to-date information, please review the most recent Release Notes, or contact support with ID RECOB-3954 for reference.
2021.8.220 and higher
Product Defect ID: RECOB-3954
The following steps need to be followed when the Visibility monitored database is upgraded to a newer version. The steps below apply when an Oracle Database is upgraded from 12c to 19c.
Back up the current $VISIBILITY_HOME/server/config/server.properties file
Edit the Origin section of the $VISIBILITY_HOME/server/config/server.properties file and replace the below with the new connection information for the monitored Warehouse:
Server.Origin.Name.X=<warehousename> <---- The name of the monitored database name. This name must match the name in your license key.
Server.Origin.UserName.X=<repository-user>
Server.Origin.Password.X=<repository-password>
Server.Origin.Database.X=<repository-database> <---- The connection string to connect to the monitored database.
Edit the Cataloger section of the $VISIBILITY_HOME/server/config/server.properties file and replace the below:
Server.Origin.Name.X=devfc19 <--- The name of the monitored system being cataloged. This must match the name in your license key as well as the value for the Server.Origin.Names.X property.
Run the below catalog command:
cd $VISIBILITY_HOME/server/bin
./cataloger -sc -catalog <warehouse-database-name>
Note: It's best to run the above command with nohup, since the catalog process can take a long time to finish depending on the number of objects in the monitored database.
Like this: nohup ./cataloger -sc -catalog <warehouse-database-name> &