After migrating from Talend v7 to v8, Talend Big Data Job execution on EMR (based on EMR 5.29) with an HDFS configuration fails in the China region with the following error.
==Log==
org.apache.hadoop.fs.s3.S3Exception: org.jets3t.service.S3ServiceException: Service Error Message. -- ResponseCode: 403, ResponseStatus: Forbidden, XML Error Message: <?xml version="1.0" encoding="UTF-8"?><Error><Code>InvalidAccessKeyId</Code><Message>The AWS Access Key Id you provided does not exist in our records.</Message><AWSAccessKeyId>AKIATAYZAWDCxxx</AWSAccessKeyId><RequestId>HRZ7WJJFEREGFNGX</RequestId><HostId>2wOWmpLVjpjPfmclmQF8sZ6t3+QVjC1K8zzyyHbgphS==</HostId></Error>
==Log==
The jets3t library currently used by Hadoop does not support the China region (cn-north-1). Although the S3 signature has been upgraded to V4, other AWS regions still use the V2 signature to avoid compatibility issues.
Because the China region is a newer region without those compatibility issues, only V4 support has been added to it.
defining-amazon-emr-connection-parameters-with-spark-universal
Internal defect ID: TBD-16745
The following error may occur when executing a Job in Talend Studio version 8 that references a custom user routine.
java.lang.NoClassDefFoundError: routines/my-custom-routine
The error occurs because the custom routine cannot be found at runtime, which prevents the Job from executing.
To resolve this issue, please uncheck the "Offline" checkbox in Talend Studio's Maven preferences (Window -> Preferences -> Maven). This enables Talend to download the required dependencies, thereby resolving the classpath conflict and enabling successful Job execution.
Asking questions using Insight Advisor Chat in the Hub on Qlik Sense Enterprise on Windows may result in the message "Unable to get data" being returned. See Fig 1.
Verify that the LEF file includes either of the following two attributes to be entitled to Insight Advisor Chat:
Then do the following:
[nl-parser]
//Disabled=true
Identity=Qlik.nl-parser
[nl-app-search]
//Disabled=true
Identity=Qlik.nl-app-search
Qlik Sense Enterprise on Windows
This is a problem which on first impressions should not (and you would think logically cannot) happen. Therefore it is important to understand why it does, and what can be done to resolve it when it does.
The situation is that Replicate is doing a Full Load for a table (individually or as part of a task full loading many tables). The source and target tables have identical unique primary keys. There are no uppercasing or other character set issues relating to any of the columns that make up the key which might sometimes cause duplication problems. Yet as the Full Load for the table progresses, probably nearing the end, you get a message indicating that Replicate has failed to insert a row into the target because of a duplicate. That is, there is already a row in the target table with the unique key of the row it is trying to insert. The Full Load for that table is terminated (often after several hours), and if you try again the same error will often occur, perhaps for a different row.
Logically this shouldn't happen, but it does. The likelihood of it doing so depends on the source DBMS type and the types of columns in the source table, and you will find it always involves a table that is being updated (SQL UPDATEs) as Replicate copies it. The higher the update rate and the bigger the table, the more likely it is to happen.
Note: This article discusses problems related to duplicates in the TARGET_LOAD, not the TARGET_APPLY; that is, during Full Load and before the cached changes start to be applied.
To understand the fix we first need to understand why the problem occurs, and this involves understanding some of the internal workings of most conventional Relational Database Management Systems.
RDBMSs tend to employ different terminology for things that exist in all of them. I'm going to use DB2 terminology and explain each term the first time I use it. With a different RDBMS the terminology may be different, but the concepts are generally the same.
The first concept to introduce is the Tablespace. That’s what it’s called in DB2, but it exists for all databases and is the physical area where the rows that make up the table are stored. Logically it can be considered as a single contiguous data area, split up into blocks, numbered in ascending order.
This is where your database puts the row data when you INSERT rows into the table. What’s also important is that it tries to update the existing data for a row in place when you do an UPDATE, but may not always be able to do so. If that is the case then it will move the updated row to another place in the tablespace, usually at what is then the highest used (the endpoint) block in the tablespace area.
The next point concerns how the DBMS decides to access data from the tablespace in resolving your SQL calls. Each RDBMS has an optimiser, or something similar that makes these decisions. The role of indexes with a relational database is somewhat strange. They are not really part of the standard Relational Database model, although in practice they are used to guarantee uniqueness and support referential integrity. Other than for these roles, they exist only to help the optimiser come up with faster ways of retrieving rows that satisfy your SELECT (database read) statements.
When any piece of SQL (we’ll focus on simple SELECT statements here) is presented to the optimiser, it decides on what method to use to search for and retrieve any matching rows from the tablespace. The default method is to search through all the rows directly in the tablespace looking for rows that match any selection criteria, this is known as a Tablespace Scan.
A Tablespace Scan may be the best way to access rows from a table, particularly if it is likely that many or most of the rows in the table will match the selection criteria. For other SELECTs though that are more specific about what row(s) are required, a suitable matching index may be used (if one exists) to go directly to the row(s) in the tablespace.
The sort of SQL that Replicate generates to execute against the source table when it is doing a Full Load is of the form SELECT * FROM, or SELECT col1, col2, … FROM. Neither of these has any row specific selection criteria, and in fact this is to be expected as a Full Load is in general intended to select all rows from the source table.
As a result the database optimiser is not likely to choose to use an index (even if a unique index on the table exists) to resolve this type of SELECT statement, and instead a Tablespace Scan of the whole tablespace area will take place. This, as you will see later, can be inconvenient to us but is in fact the fastest way of processing all the rows in the table.
When we do a Full Load copy for a table that is ‘live’ (being updated as we copy it), the result we end up with when the SELECT against the source has been completed and we have inserted all the rows into the target is not likely to be consistent with what is then in the source table. The extent of the differences is dependent on the rate of updates and how long the Full Load for that table takes. For high update rates on big tables that take many hours for a Full Load the extent of the differences can be quite considerable.
This all sounds very worrying, but it is not, as the CDC (Change Data Capture) part of Replicate takes care of it. CDC is mainly known for replicating changes from source to target after the initial Full Load has been taken, keeping the target copies up to date and in line with the changing source tables. However, CDC processing has an equally important role to play in the Full Load process itself, especially when this is being done on 'live' tables subject to updates while the Full Load is being processed.
In fact, CDC processing doesn't start when Full Load is finished, but before Full Load starts. This is so that it can collect details of changes that are occurring at the source whilst the Full Load (and its associated SELECT statement) is taking place. The changes collected during this period are known as the 'cached changes', and they are applied to the newly populated target table before switching into normal ongoing CDC mode to capture all subsequent changes.
This takes care of and fixes all of the table row data inconsistencies that are likely to occur during a table Full Load, but there is one particular situation that can occur and catch us out before the Full Load completes and the cached changes can be applied. This results in Replicate trying to insert details for the same row more than once in the target table; triggering the duplicates error that we are talking about here.
Consider this situation: the tablespace scan driving Replicate's Full Load SELECT reads a row early in the scan, and Replicate inserts it into the target. While the scan is still running, that row is updated at the source and the database cannot update it in place, so the updated version is moved to a new location towards the end of the tablespace, beyond the point the scan has reached. When the scan eventually gets there, it reads the same logical row again, and Replicate tries to insert it into the target a second time.
That is how the problem occurs. Having variable-length columns and binary object columns in the source table makes this (movement of the row to a new location in the tablespace) much more likely to happen, and with it the duplicate insert problem.
So how do we fix this, or at least find a way to stop it from happening?
The solution is to persuade the optimiser in the source database to use the unique index on the table to access the rows in the table’s tablespace rather than scanning sequentially through it. The index (which is unique) will only provide one row to read for each key as the execution of our SELECT statement progresses. We don’t have to worry about whether it is the ‘latest’ version of the row or not because that will be taken care of later by the application of the cached changes.
The optimiser can (generally) be persuaded to use the unique index on the source table if the SELECT statement indicates that there is a requirement to return the rows in the result set in the order given by that index. This requires a SELECT statement with an ORDER BY clause matching the columns in the unique index; something of the form SELECT * FROM ORDER BY col1, col2, col3, etc., where col1, col2, col3 etc. are the columns that make up the table's unique primary index.
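For example (the table and column names are purely illustrative), for a source table ORDERS whose unique primary index is (ORDER_ID, LINE_NO), the statement we want the optimiser to see is:
SELECT * FROM ORDERS ORDER BY ORDER_ID, LINE_NO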
But how can we do this? Replicate has an undocumented facility that allows the user to configure extra text to be added to the end of the generated SQL for a particular table during Full Load processing, specifically to add a WHERE clause that determines which rows are included and excluded during a Full Load extract.
This is not exactly what we want to do (we want to include all rows), but this ‘FILTER’ facility also provides the option to extend the content of the SELECT statement that is generated after the WHERE part of the statement has been added. So we can use it to add the ORDER BY part of the statement that we require.
Here is the format of the FILTER statement that you need to add.
--FILTER: 1=1) ORDER BY col1, col2, coln --
This is inserted in the ‘Record Selection Condition’ box on the individual table filter screen when configuring the Replicate task. If you want to do this for multiple tables in the Replicate task then you need to set up a FILTER for each table individually.
To explain, the --FILTER: keyword indicates the beginning of filter information that is expected to begin with a WHERE clause (which is generated automatically).
The 1=1) component completes that WHERE clause in a way that selects all rows (you could put in something to limit the rows selected if required, but that's not what we are trying to achieve here).
It is then possible to add other clauses and parameters before terminating the additional text to be added with the final --.
In this case an ORDER BY clause is added that guarantees rows are returned in the specified order. This causes the unique index on the table to be used to retrieve rows at the source, assuming that you code col1, col2, etc. to match the columns and their order in the index. If the index has some columns in descending order (rather than ascending), make sure that is coded in the ORDER BY clause as well.
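As a hypothetical worked example, for a table whose unique primary index is (ORDER_ID, LINE_NO), both ascending, the text entered in the Record Selection Condition box would be:
--FILTER: 1=1) ORDER BY ORDER_ID, LINE_NO --
(the column names are illustrative; substitute the key columns of your own table, in index order).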
If you code things incorrectly the generated SELECT statement will fail and you will be able to see and debug this through the log.
After migrating a Spark Job from version 7.3.1 to version 8.0.1, the migrated Spark Job wrote its application log at DEBUG level; for some large Spark executions this generated up to 10 GB of logs. In the Spark Job design, the log4jLevel option was unchecked by default.
Because the log configuration for spark.driver and spark.executor is not set by default, the Spark Batch Job executes at DEBUG level by default.
In Run -> Spark Configuration -> Advanced properties (or in the wizard if using the repository):
Add the property "spark.driver.extraJavaOptions" with value "-Dlog4j.configuration=/etc/spark/conf.cloudera.spark_on_yarn/log4j.properties"
Add the property "spark.executor.extraJavaOptions" with value "-Dlog4j.configuration=/etc/spark/conf.cloudera.spark_on_yarn/log4j.properties"
Note: /etc/spark/conf.cloudera.spark_on_yarn/log4j.properties is the default value provided on CDP, and you can customize the log levels in it as needed. This alters the logger configuration used when the Job is executed on YARN.
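As a minimal sketch of such a properties file (assuming the log4j 1.x format used by Spark on CDP; the appender name is illustrative), capping the level at WARN would look like this:
log4j.rootCategory=WARN, console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.err
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n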
Note: The concepts 'UPSERT MODE' and 'MERGE MODE' are not documented in the User Guide; they are not terms you can search for in the User Guide, nor keywords in the Replicate UI.
UPSERT MODE: Change an update to an insert if the row doesn't exist on the target
MERGE MODE: Change an insert to an update if the row already exists on the target
Use MERGE MODE: i.e. configure the task under Task Settings --> Error Handling --> Apply Conflicts --> 'Duplicate key when applying INSERT:' UPDATE the existing target record
Use UPSERT MODE: i.e. configure the task under Task Settings --> Error Handling --> Apply Conflicts --> 'No record found for applying an UPDATE:' INSERT the missing target record
Batch Apply and Transactional Apply modes:
There is a big difference in how these Upsert/Merge settings work depending on whether the task is in 'Batch' or 'Transactional' Apply mode.
Batch Apply mode:
Either option (Upsert/Merge) does an unconditional Delete of all rows in the batch, followed by an Insert of all rows.
Note: The other thing to note is that with this setting the update that would otherwise fail is applied as an insert in a way that may not be obvious and could cause issues with downstream processing. In Batch Apply mode the task will actually issue a pair of transactions (first a delete of the record and then an insert); this pair of transactions is unconditional and will result in a newly inserted row every time the record is updated on the source.
Transactional Apply mode:
Either option (Upsert/Merge) - the original statement is run and if it errors out then the switch is done (try and catch).
In Transactional Apply mode, the INSERT statement is performed in a "try / catch" fashion: the INSERT is run, and only if it fails is it switched to an UPDATE statement.
Likewise, the UPDATE is performed in a "try / catch" fashion: the UPDATE is run, and only if it fails is it switched to an INSERT statement.
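As an illustrative pseudo-SQL sketch of the Transactional Apply behaviour described above (table and column names are hypothetical; the real statements are generated by Replicate):
-- UPSERT MODE: the UPDATE is attempted first
UPDATE target_table SET col1 = :col1 WHERE pk = :pk;
-- if no target record is found, Replicate switches to:
INSERT INTO target_table (pk, col1) VALUES (:pk, :col1);
-- MERGE MODE is the mirror image: the INSERT is attempted first and is switched to an UPDATE on a duplicate key error.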
The information in this article is provided as-is and to be used at own discretion. Depending on tool(s) used, customization(s), and/or other factors ongoing support on the solution below may not be provided by Qlik Support.
The Qlik Sense Mobile app allows you to securely connect to your Qlik Sense Enterprise deployment from your supported mobile device. This is the process of configuring Qlik Sense to function with the mobile app on iPad / iPhone.
This article applies to the Qlik Sense Mobile app used with Qlik Sense Enterprise on Windows. For information regarding the Qlik Cloud Mobile app, see Setting up Qlik Sense Mobile SaaS.
See the requirements for your mobile app version on the official Qlik Online Help > Planning your Qlik Sense Enterprise deployment > System requirements for Qlik Sense Enterprise > Qlik Sense Mobile app
Out of the box, Qlik Sense is installed with HTTPS enabled on the hub and HTTP disabled. Due to iOS specific certificate requirements, a signed and trusted certificate is required when connecting from an iOS device. If using HTTPS, make sure to use a certificate issued by an Apple-approved Certification Authority.
Also check Qlik Sense Mobile on iOS: cannot open apps on the HUB for issues related to Qlik Sense Mobile on iOS and certificates.
For testing purposes, it is possible to enable port 80.
If not already done, add an address to the White List:
An authentication link is required for the Qlik Sense Mobile App.
NOTE: In the client authentication link host URI, you may need to remove the trailing "/" from the URL; for example, http://10.76.193.52/ would become http://10.76.193.52
Users connecting to Qlik Sense Enterprise need a valid license available. See the Qlik Sense Online Help for more information on how to assign available access types.
Qlik Sense Enterprise on Windows > Administer Qlik Sense Enterprise on Windows > Managing a Qlik Sense Enterprise on Windows site > Managing QMC resource > Managing licenses
If a TCP connection is possible with Qlik's licensing server endpoint, testing the connection to license.qlikcloud.com will return the message default backend - 404 or 404 Not Found (nginx).
When testing whether or not your Sense installation can successfully connect to the license backend, always test the connection with all nodes.
The 404 HTTP error code indicates the server was reached but could not find any content to be displayed in the URL address specified.
To avoid a 404 message, rather than accessing license.qlikcloud.com, open license.qlikcloud.com/sld.
Another test would be to use telnet and confirm a connection to port 443 is possible:
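For example, a quick check from each node could look like this (using the built-in Windows telnet client or PowerShell; host and port as above):
telnet license.qlikcloud.com 443
Test-NetConnection license.qlikcloud.com -Port 443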
If different results are returned:
There was an error when getting license information from the license server
For security reasons, the customer's policy was to inject the Content Security Policy header via the reverse proxy.
The policy adopted is basic: default-src 'self'
Opening the QlikView AccessPoint or Qlik Sense Hub may fail or the AccessPoint may only render partially.
The Browser Debug tools will provide more insight:
QlikView
Qlik Sense Enterprise on Windows
The Content-Security-Policy header contains a string of rules that informs the browser which resources/code are trusted to be loaded, executed, or rendered.
More details on the directives can be found here:
https://www.w3.org/TR/CSP3/
For QlikView AccessPoint, a first example is to use Content-Security-Policy: "default-src 'self' 'unsafe-inline' data: ;" ; (note that using the 'unsafe-inline' option could be unsafe in the proxy injection scenario when the client browses a different site; you could evaluate using the sha256-hashcode version instead).
Further options may be necessary if, for example, you have QlikView Extension Objects (Server and Document Extensions) that use external resources downloaded from CDN locations.
In this case the troubleshooting approach is the same: use F12/Developer Tools to check which resource violates the policy and add an exclusion.
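For example (cdn.example.com is a placeholder for the CDN host reported by the developer tools), the injected header could be extended along these lines:
Content-Security-Policy: "default-src 'self' 'unsafe-inline' data: ; script-src 'self' 'unsafe-inline' https://cdn.example.com ;" ;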
QlikView Access Point Shows "Loading Content" Indefinitely
What is CSP (Content-Security-Policy) and How does it Relate to Qlik?
This Techspert Talks session addresses:
Tip: Download the LogAnalyzer app here: LogAnalysis App: The Qlik Sense app for troubleshooting Qlik Sense Enterprise on Windows logs.
00:00 - Intro
01:22 - Multi-Node Architecture Overview
04:10 - Common Performance Bottlenecks
05:38 - Using iPerf to measure connectivity
09:58 - Performance Monitor Article
10:30 - Setting up Performance Monitor
12:17 - Using Relog to visualize Performance
13:33 - Quick look at Grafana
14:45 - Qlik Scalability Tools
15:23 - Setting up a new scenario
18:26 - Look at the QSST Analyzer App
19:21 - Optimizing the Repository Service
21:38 - Adjusting the Page File
22:08 - The Sense Admin Playbook
23:10 - Optimizing PostgreSQL
24:29 - Log File Analyzer
27:06 - Summary
27:40 - Q&A: How to evaluate an application?
28:30 - Q&A: How to fix engine performance?
29:25 - Q&A: What about PostgreSQL 9.6 EOL?
30:07 - Q&A: Troubleshooting performance on Azure
31:22 - Q&A: Which nodes consume the most resources?
31:57 - Q&A: How to avoid working set breaches on engine nodes?
34:03 - Q&A: What do QRS log messages mean?
35:45 - Q&A: What about QlikView performance?
36:22 - Closing
Resources:
LogAnalysis App: The Qlik Sense app for troubleshooting Qlik Sense Enterprise on Windows logs
Qlik Help – Deployment examples
Using Windows Performance Monitor
PostgreSQL Fine Tuning starting point
Qlik Sense Shared Storage – Options and Requirements
Qlik Help – Performance and Scalability
Q&A:
Q: Recently I'm facing Qlik Sense proxy servers' RAM overload, although there are 4 nodes and each node has 16 CPUs and 256 GB. We have done app optimization, like deleting duplicate apps, removing old data, removing unused fields... but RAM status is still not good. What is next to fix the performance issue? Apply more nodes?
A: Depends on what you mean by “RAM status still not good”. Qlik Data Analytics software will allocate and use memory within the limits established and does not release this memory unless the Low Memory Limit has been reached and cache needs cleaning. If RAM consumption remains high but no other effects, your system is working as expected.
Q: Similar to other databases, do you think we need to perform fine-tuning and clean up bad records within PostgreSQL, e.g. once per year?
A: Periodic cleanup, especially in a rapidly changing environment, is certainly recommended. A good starting point: set your Deleted Entity Log table cleanup settings to appropriate values, and avoid clean-up tasks kicking in before user morning rampup.
Q: Does QlikView Server perform similarly to Qlik Sense?
A: It uses the same QIX Engine for data processing. There may be performance differences to the extent that QVW Documents and QVF Apps are completely different concepts.
Q: Is there a simple way (better than restarting QS services) to clean the cache, because cache around 90% slows down QS?
A: It's not quite that simple. Qlik Data Analytics software (and by extension, your users) benefits from keeping data cached as long as possible. This way, users consume pre-calculated results from memory instead of computing the same results over and over. Active cache clearing is detrimental to performance. High RAM usage is entirely normal, based on the Memory Limits defined in the QMC. You should not expect Qlik Sense (or QlikView) to manage memory like regular software. If work stops, this does not mean memory consumption will go down; we expect to receive and serve more requests, so we keep as much cached as possible. Long-winded, but I hope this sets better expectations when considering "bad performance" without the full technical context.
Q: When CPU hits 100%, how do we know what the culprit is, for example too many concurrent users loading apps/datasets or multiple apps/QVDs reloading? Can we see that anywhere?
A: We will provide links to the Log Analysis app I demoed during the webinar, this is a great place to start. Set Repository Performance logs to DEBUG for the QRS performance part, start analysing service resource usage trends and get to know your user patterns.
Q: Can there be repository connectivity issues with too many nodes?
A: You can only grow an environment so far before hitting physical limits to communication. As a best practice, with every new node added, QRS connection pools and DB connectivity should be reviewed and increased where necessary. The most common problem here is having added more nodes than the connections allowed to the DB or Repository Services; this will almost guarantee communication issues.
Q: Do the Qlik Scalability Tools measure browser rendering time as well, or do they just work at the API layer?
A: Excellent question, it only evaluates at the API call/response level. For results that include browser-side rendering, other tools are required (LoadRunner, complex to set up, expert help needed).
Transcript:
Hello everyone and welcome to the November edition of Techspert Talks. I’m Troy Raney and I’ll be your host for today's session. Today's presentation is Optimizing Performance for Qlik Sense Enterprise with Mario Petre. Mario why don't you tell us a little bit about yourself?
Hi everyone; good to be here with everybody once again. My name is Mario Petre. I’m a Principal Technical Engineer in the Signature Support Team. I’ve been with Qlik over six years now and since the beginning, I’ve focused on Qlik Sense Enterprise backend services, architecture and performance from the very inception of the product. So, there's a lot of historical knowledge that I want to share with you and hopefully it's an interesting springboard to talk about performance.
Great! Today we're going to be talking about how a Qlik Sense site looks from an architectural perspective; what are things that should be measured when talking about performance; what to monitor after going live; how to troubleshoot and we'll certainly highlight plenty of resources and where to find more details at the end of the session. So Mario, we're talking about performance for Qlik Sense Enterprise on Windows; but ultimately, it's software on a machine.
That's right.
So, first we need to understand what Qlik Sense services are and what type of resources they use. Can you show us an overview from what a multi-node deployment looks like?
Sure. We can take a look at how a large Enterprise environment should be set up.
And I see all the services have been split out onto different nodes. Would you run through the acronyms quickly for us?
Yep. On a consumer node this is where your users come into the Hub. They will come in via the Qlik Proxy Service and consume applications via the Qlik Engine Service, that ultimately connects to the central node and everything else via the Qlik Repository Service.
Okay.
The green box is your front-end services. This is what end users tap into to consume data, but what facilitates that in the background is always the Repository Service.
And what's the difference between the consumer nodes on the top and the bottom?
These two nodes have a Proxy Service that balances against their own engines as well as other engines; while the consumer nodes at the bottom are only there for crunching data.
Okay.
And then we can take a look at the back-end side of things. Resources are used to the extent that you're doing reloads; you will have an engine there, as well as the primary role for the central node (active and failover), which is the Repository Service coordinating communication between all the rest of the services. You can also have a separate node for development work. And ultimately, we also expect an environment of this size to have a dedicated storage solution and a dedicated central Repository Database host, either locally managed or in one of the cloud providers like AWS RDS, for example.
Between the front-end and back-end services where's the majority of resource consumption, and what resources do they consume?
Most of the resource allocation here is going to go to the Engine Service; and that will consume CPU and RAM to the extent that it's allocated to the machine. And that is done at the QMC level where you set your Working Set Limits. But in the case of the top nodes, the Proxy Service also has a compute cost as it is managing session connectivity between the end user's browser and the Engine Service on that particular server. And the Repository Service is constantly checking the authorization and permissions. So, ultimately front-end servers make use of both front-end and back-end resources. But you also need to think about connectivity. There is the data streaming from storage to the node where it will be consumed and then loading from that into memory. And these are three different groups of resources: you have compute; you have memory, and you have network connectivity. And all three have to be well suited for the task for this environment to work well.
And we're talking about speed and performance like, how fast is a fast network? How can we even measure that?
So, for any Enterprise environment we would start at a 10 Gb network speed, and ultimately we expect a response time of 4 ms between any node and the storage back end.
Okay. So, what are some common bottlenecks and issues that might arise?
All right. So, let's take a look at some examples. The Repository Service failing to communicate with rim nodes or with local services: I would immediately try to verify that the Repository Service connection pool and network connectivity are stable and correct. Let's say apps load very, very slowly the first time. This is where network speed really comes into play. Another example: the QMC or the Hub takes a very long time to load. And for that, we would have to look into the communication between the Repository Service and the Database, because that's where we store all of the metadata that your permissions are calculated from.
And could that also be related to the rules that people have set up and the number of users accessing?
Absolutely. You can hurt user experience by writing complex rules.
What about lag in the app itself?
This is now being consumed by the Engine Service on the consumer node. So, I would immediately try to evaluate resource consumption on that node, primarily CPU. Another great example is high Page File usage. We prefer memory for working with applications, so as soon as we try to cache and pull those results from disk, performance will suffer. And ultimately, the direct connectivity: how good and stable is the network between the end user's machine and the Qlik Sense infrastructure? The symptom will be on the end user side, but the root cause almost always (I mean 99.9% of the time) will be down to some effect in the environment.
So, to get an understanding of how well the machine works and establish that baseline, what can we use?
One simple way to measure this (CPU, RAM, disk network) is this neat little tool called iPerf.
Okay. And what are we looking at here?
This is my central node.
Okay. And iPerf will measure what exactly?
How fast data transfer is between this central node and a client machine or another server.
And where can people find iPerf?
Great question. iPerf.fr
And it's a free utility, right?
Absolutely.
So, I see you've already got it downloaded there.
Right. You will have to download this package, both on the server and the client machine that you want to test between. We'll run this “As Admin.” We call out the command; we specify that we want it to start in “server mode.” This will be listening for connection attempts.
Okay.
We can define the port. I will use the default one. Those ports can be found in Qlik Help.
Okay.
The format for the output in megabyte; and the interval for refresh 5 seconds is perfectly fine. And then, we want as much output as possible.
Okay.
First, we need to run this. There we go. It started listening. Now, I’m going to switch to my client machine.
So, iPerf is now listening on the server machine and you're moving over to the client machine to run iPerf from there?
Right. Now, we've opened a PowerShell window into iPerf on the client machine. Then we call the iPerf command. This time, we're going to tell it to launch in “Client Mode.” We need to specify an IP address for it to connect to.
And that's the IP address of the server machine?
Right. Again, the port; the format so that every output is exactly the same. And here, we want to update every second.
Okay.
And this is a super cool option: if we use the bytes flag, we can specify the size of the data payload. I’m going to go with a 1 Gb file (1024 Mb). You can also define parallel connections. I want 5 for now.
So, that's like 5 different users or parallel streams of activity of 1 Gb each between the server machine and this client machine?
Right. So, we actually want to measure how fast can we acquire data from the Qlik Sense server onto this client machine. We need to reverse the test. So, we can just run this now and see how fast it performs.
Okay. And did the server machine react the same way?
You can see that it produced output on the listening screen. This is where we started. And then it received and it's displaying its own statistics. And if you want to automate this, so that you have a spot check of throughput capacity between these servers, we need to use the log file option. And then we give it a path. So, I’m gonna say call this “iperf_serverside…” And launch it. And now, no output is produced.
Okay.
So, we can switch back to the client machine.
Okay. So, you're performing the exact same test again, just storing everything in a log file.
The test finished.
Okay. So, that can help you compare between what's being sent to what's being received, and see?
Absolutely. You can definitely have results presented in a way that is easy to compare across machines and across time. And initial results gave us a throughput per file of around 43.6, 46, thereabouts megabytes per second.
So, what about for an end user who's experiencing issues? Can you use iPerf to test the connectivity from a user machine on a different network?
Yep. So, in the background we will have our server; it's running and waiting for connections. And let's run this connection now from the client machine. We will make sure that the IP address is correct; default port; the output format in megabytes; we want it refreshed every second; and we are transferring 1 Gb; and 5 parallel streams in reverse order, meaning we are copying from the server to the client machine. And let's run it.
Just seeing those numbers, they seem to be smaller than what we're seeing from the other machine.
Right. Indeed. I have some stuff in between to force it to talk a little slower. But this is one quick way to identify a spotty connection. This is where a baseline becomes gold; being able to demonstrate that your platform is experiencing a problem. And to quantify and to specify what that problem is going to reduce the time that you spend on outages and make you more effective as an admin.
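(For reference, a sketch of the iPerf commands described in this section, assuming iPerf3; the server address, port, and log file name are illustrative:
Server side: iperf3 -s -p 5201 -f M -i 5 --logfile C:\perflogs\iperf_serverside.txt
Client side: iperf3 -c 10.0.0.10 -p 5201 -f M -i 1 -n 1024M -P 5 -R)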
Okay. That was network. How can admins monitor all the other performance aspects of a deployment? What tools are available and what metrics should they be measuring?
Right. That's a great question. The very basic is just Performance Monitor from Windows.
Okay.
The great thing about that is that we provide templates that also include metrics from our services.
Can you walk us through how to set up the Performance Monitor using one of those templates?
Sure thing. So, we're going to switch over first to the central node. So, the first thing that I want to do is create a folder where all of these logs will be stored.
Okay. So, that's a shared folder, good.
And this article is a great place to start. So, we can just download this attachment
So, now it's time to set up a Performance Monitor proper. We need to set up a new Data Collector Set.
Giving it a name.
And create from template. Browse for it, and finish.
Okay. So it’s got the template. That's our new one Qlik Sense Node Monitor, right?
Yep. You'll have multiple servers all writing to the same location. The first thing is to define the name of each individual collector; and you do that here. And you can also provide subdirectory for these connectors, and I suggest to have one per node name. I will call this Central Node.
Everything that comes from this node, yeah.
Correct. You can also select a schedule for when to start these. We have an article on how to make sure that Data Collectors are started when Windows starts. And then a stop condition.
Now, setting up monitors like this; could this actually impact performance negatively?
There is always an overhead to collecting and saving these metrics to a file. But the overhead is negligible.
Okay.
I am happy with how this is defined. Now, this static collector on one of the nodes is already set up. There is an option here that's called Data Manager. What's important here to define is to set a Minimum Free Disk. We could go with 10 Gb, for example; and you can also define a Resource Policy. The important bit is Minimum Free Disk. We want to Delete the Oldest (not the largest) in the Data Collector itself. We should change that directory and make sure that it points to our central location instead of locally; and we'll have to do this for every single node where we set this up.
Okay. So, that's that shared location?
Yep.
And you run the Data Collector there. And it creates a CSV file with all those performance counters. Cool.
So, here we have it now. If we just take a very quick look inside, we'll see a whole bunch of metrics. And if you want to visualize these really really quick, I can show you a quick tip that wasn't on the agenda but since we're here: on Windows, there is a built-in tool called Relog that is specifically designed for reformatting Performance Monitor counters. So, we can use Relog; we'll give it the name of this file; the format will be Binary; the output will be the same, but we'll rename it to BLG; and let's run it.
And now it created a copy in Binary format. Cool thing about this Troy is that: you can just double click on it.
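(For reference, the Relog command described here would look roughly like this, with an illustrative file name:
relog QlikSenseNodeMonitor.csv -f BIN -o QlikSenseNodeMonitor.blg)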
It's already formatted to be a little more readable. Wow! Check that out.
There we go. Another quick tip: since we're here, first thing to do is: select everything and Scale; just to make sure that you're not missing any of the metrics. And this is also a great way to illustrate which service counters and system counters we collect. As you can see, there's quite a few here.
Okay. So, that Performance Monitor is, it's set up; it's running; we can see how it looks; and that is going to run all the time or just when we manually trigger it?
You can definitely configure it to run all the time, and that would be my advice. Its value is really realized as a baseline.
Yeah. Exactly. That was pretty cool seeing how that worked, using all the built-in utilities. And that Relog formatting for the Performance Monitor logs was new to me. Are there any other tools you'd like to highlight?
Yeah. So, Performance Monitor is built-in. For larger Enterprises that may already be monitoring resources in a centralized way, there's no reason why you shouldn't expect to include the Sense resources into that live monitoring. And this could be done via different solutions out there. A few come to mind like: Grafana, Datadog, Butler SOS, for example from one of our own Qlik luminaries.
Can we take a quick look at Grafana? I’ve heard of that but never seen it.
Sure thing. This is my host monitor sheet. It's nowhere built to a corporate standard, but you can see here I’m looking at resources for the physical host where these VMs are running as well as the domain controller, and the main server where we've been running our CPU tests. And the great part about this is I have historical data as far back I believe as 90 days.
So, this is a cool tool that lets you like take a look at the performance and zoom-in and find the processes that might be causing some peaks or anything you want to investigate?
Right. Exactly. At least come up with a narrow time frame for you to look into the other tools and, again, narrow down the window of your investigation.
Yeah, that could be really helpful. Now I wanted to move on to the Qlik Sense Scalability Tools. Are those available on Qlik community?
That's right. Let me show you where to find them. You can see that we support all current versions including some of the older ones. You will have to go through and download the package and the applications used for analysis afterwards. There is a link over here. So, once the package is downloaded, you will get an installer. And the other cool thing about Scalability Tools is that you can use it to pre-warm the cache on certain applications since Qlik Sense Enterprise doesn't support application pre-loading.
Oh, cool. So, you can throttle up applications into memory like in QlikView. Can we take a look at it?
Yes, absolutely. This is the first thing that you'll see. We'll have to create a new connection. So, I’ll open a simple one that I’ve defined here and we can take a look at what's required just to establish a quick connection to your Qlik Sense site.
Okay, but basically the scenario that you're setting up will simulate activity on a Qlik Sense site to test its performance?
Exactly. You'll need to define your server hostname. This can be any of your proxy nodes in the environment. The virtual proxy prefix. I’ve defined it as Header and authentication method is going to be WebSocket.
Okay.
And then, if we want to look at how virtual users are going to be injected into the system, scroll over here to the user section. Just for this simple test, I’ve set it up for User List where you can define a static list of users like so: User Directory and UserName.
Okay. So, it's going to be taking a look at those 2 users you already predefined and their activity?
Exactly. We need to test the connection to make sure that we can connect to the system. Connection Successful. And then we can proceed with the scenario. This is very simple but let me show you how I got this far. So, the very first thing that we should do is to Open an App.
So, you're dragging away items?
Yep. I’m removing actions from this list. Let's try to change the sheet. A very simple action. And now we have four sheets, and we'll go ahead and select one of them.
Okay, so far, we have Opening the App and immediately changing to a sheet?
Yep. That's right. This will trigger actions in sequence exactly how you define them. It will not take into consideration things like Think Time. I will just define a static wait of 15 seconds, and then you can make selections.
But this is an amazing tool for being able to kind of stress test your system.
It's very very useful and it also provides a huge amount of detail within the results that it produces. One other quick tip: while defining your scenario, use easy to read labels, so that you can identify these in the Results Application. Let's assume that the scenario is defined. We will go ahead and add one last action and that is: to close, to Disconnect the app. We'll call this “OpenApp.” We'll call this “SheetChange.” Make sure you Save. The connection we've tested; we've defined our list of users. First, let's run the scenario. There is one more step to define and that is: to configure an Executor that will use this scenario file to launch a workload against our system. Create a New Sequence.
This is just where all these settings you're defining here are saved?
Correct. This is simply a mapping between the execution job that you're defining and which script scenario should be used. We'll go ahead and grab that. Save it again; and now we can start it. And now in the background if we were to monitor the Qlik Sense environment, we would see some amount of load coming in. We see that we had some kind of issue here: empty ObjectID. Apparently I left something in the script editor; but yeah, you kind of get the idea.
So, all this performance information would then be loaded into an app that is part of the package downloaded from Qlik community. How does that look?
So, here you will see each individual result set, and you can look at multiple-exerciser runs in the single application. Unfortunately, we don't have more than one here to showcase that, but you would see multiple-colored lines. There is metrics for a little bit of everything: your session ramp, your throughput by minute, you can change these.
CPU, RAM. This is great.
Exactly. CPU and RAM. These are not connected; we don't have those logs, but you would have them for a setup run on your system. These come from Performance Monitor as well, so you could just use those logs provided that the right template is in place. We see Response Time Distribution by Action, and these are the ones that I've asked you to change and name so that they're easy to understand.
Once your deployment is large enough to need to be multi-node and the default settings are no longer the best ones for you, what needs to be adjusted with a Repository Service to keep it from choking or to improve its performance?
That's a great question Troy. So, the first thing that we should take a look at is how the Repository communicates with the backend Database and vice versa. The connection pool for the Repository is always based on core count on the machine. And the best rule of thumb that we have to date is to take your core count on that machine, multiply it by 5, and that will be the max connection pool for the Repository Service for that node.
Can you show us where that connection pool setting can be changed?
Yes. So, we will go ahead and take a look. Here we are on the central node of my environment. You'll have to find your Qlik installation folder. We'll navigate to the Repository folder, Util, QlikSenseUtil, and we'll have to launch this “As Admin.”
Okay.
We'll have to come to the Connection String Editor. Make sure that the path matches. We just have to click on Read so that we get the contents of these files. And the setting that we are about to change is this one.
Okay. So, the maximum number of connections that the Repository can make?
Yes. And this is (again) for each node going towards the Repository Database.
Okay.
Again, this should be a factor of CPU cores multiplied by 5. If 90 is higher than that result, leave 90 in place. Never decrease it.
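(As a worked example with assumed core counts: a node with 24 CPU cores gives 24 x 5 = 120, so the pool would be raised from the default 90 to 120; a node with 16 cores gives 16 x 5 = 80, which is below 90, so 90 would be left in place.)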
Okay, that's a good tip.
Right. I change this to 120. I have to Save. What I like to do here is: clear the screen and hit Read again; just to make sure that the changes have been persisted in the file.
Okay.
Once that's done, we can close this. We can restart the environment. We can get out of here.
So, there you adjusted the setting of how many connections this node can make to the QSR. Then assuming we do the same on all nodes, where do we adjust the total number of connections the Repository itself can receive?
That should be a sum of all of the connection strings from all of your nodes plus 110 extra for the central node. By default, here is where you can find that config file: Repository, PostgreSQL, and we'll have to open this one, PostgreSQL. Towards the end of the file…
Just going all the way to the bottom.
Here we have my Max Connections is 300.
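(As a worked example with illustrative numbers: four nodes each configured with a pool of 120 give 4 x 120 = 480, plus the 110 extra for the central node makes 590, so max_connections in postgresql.conf would be set to at least 590.)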
Okay. One other setting you mentioned was the Page File and something to be considered. How would we make changes or adjust that setting?
Right. So, this is a Windows level setting that's found in Advanced System Settings; Advanced tab; Performance; and then again Advanced; and here we have Virtual Memory.
Okay.
We have to hit Change. We'll have to leave it at System Managed or understand exactly which values we are choosing and why. If you're not sure, the default should always be System Managed.
Now, I want to know what resources are available for Qlik Sense admins; specifically, what is the Admin Playbook?
It's a great starting place for understanding what duties and responsibilities one should be thinking about when administering a Qlik Sense site.
So, these are a bunch of tools built by Qlik to help analyze your deployment in different ways. I see weekly, monthly, quarterly, yearly, and a lot of different things are available there.
Yeah. So, we can take a look at Task Analysis, for example. The first time you run it, it's going to take about 20 minutes; thereafter about 10. The benefits: it shows you really in depth how to get to the data and then how to tweak the system to work better based on what you have.
Yeah, that's great.
Right? So, not only do we put the tools in your hands, but we also show you how to build these tools, as you can see here. We have instructions on how to come up with these objects from scratch. An absolute must-read for every system admin out there.
Mario, we've talked about optimizing the Qlik Sense Repository Service, but not about Postgres. Do larger Enterprise-level deployments affect its performance?
Sure. The thing about Postgres is again: we have to configure it by default for compatibility and not performance. So, it's another component that has to be targeted for optimization.
The detail there that anything over 1 Gb from Postgres might get paged - that sounds like it could certainly impact performance.
Right, because the buffer setting that we have by default is set to 1 Gb; and that means only 1 Gb of physical memory will be allocated to Postgres work. Now, we're talking about the large environment 500 to maybe 5,000 apps. We're talking 1000s of users with about 1000 of them peak concurrency per hour.
So, can we increase that Shared Buffer setting?
Absolutely. And in fact, I want to direct you to a really good article on performance optimization for PostgreSQL. And when we talk about fine-tuning, this article is where I’d like to get started. We talk about certain important factors like the Shared Buffers. So, this is what we define to 1 Gb by default. Their recommendation is to start with 1/4 of physical memory in your system. 1 Gb is definitely not one quarter of the machines out there. So, it needs tweaking.
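(As a worked example with an assumed host size: on a dedicated repository database host with 32 GB of RAM, that one-quarter starting point would mean shared_buffers = 8GB in postgresql.conf rather than the 1 Gb default.)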
And again these are settings to be changed on the machine that's hosting the Repository Database, right?
That's correct. That's correct.
Now, is there an app that you're aware of that would be good to kind of look at all these logs and analyze what's going on with the performance?
Absolutely. This is an application that was developed to better understand all of the transactions happening in a particular environment. It reads the log files collected with the Log Collector either via the tool or the QMC itself.
Okay.
It's not built for active monitoring, but rather to enhance troubleshooting.
Sure. So, basically it's good for looking at a short period of time to help troubleshooting?
Right. The Repository itself communicates over APIs between all the nodes and keeps track of all of the activities in the system; and these translate to API calls. If we want to focus on Repository API calls, we can start by looking at transactions.
Okay.
So, this will give us detail about cost. For example, per REST call or API call, we can see which endpoints take the most, duration per user, and this gives you an opportunity to start at a very high level and slowly drill in, both in message types and timeframe. Another sheet is Threads, Endpoints and Users; here you have performance information about how many worker-threads the Repository Service is able to start and what the Repository CPU consumption is, so you can easily identify one. For example, here, just by this count, we can see that the preview privileges call for objects is called…
Yeah, a lot.
Over half a million times, right? And represents 73% of the CPU compute cost.
Wow, nice insights.
And then if we look here at the bottom, we can start evaluating time-based patterns and select specific time frames and go into greater detail.
So, I’m assuming this can also show resource consumption as well?
Right. CPU, memory in gigabytes and memory in percent. One neat trick is: to go to the QMC, look at how you've defined your Working Set Limits, and then pre-define reference lines in this chart. So, that it's easier to visualize when those thresholds are close to being reached or breached. And you do that by the add-ons reference lines, and you can define them like this.
That's just to sort of set that to match what's in the QMC?
Exactly.
Makes a powerful visualization. So, you can really map it.
Absolutely. And you can always drill down into specific points in time. We can go and check the log details on the Engine Focus sheet; this will allow us to browse over time, select things like errors and warnings alone, and then we will have all of the messages that are coming from the log files and what their sources are.
Yeah. That's great to have it all kind of collected here in one app, that's great.
Indeed.
To summarize: we've talked about how, to understand system performance, a baseline needs to be established. That involves setting up some monitoring. There are lots of options and tools available to do that; and it's really about understanding how the system performs, so that measurement and comparisons are possible if things don't perform as expected.
And to begin to optimize as well.
Okay, great. Well now, it's time for Q&A. Please submit your questions through the Q&A panel on the left side of your On24 console. Mario, which question would you like to address first?
We have some great questions already. So, let's see - first one is: how can we evaluate our existing Qlik Sense applications?
This is not something that I've covered today, but it's a great question. We have an application on Community called App Metadata Analyzer. You can import this into your system and use it to understand the memory footprint of applications and objects within those applications and how they scale inside your system. It will very quickly illustrate if you are shipping applications with extremely large data files (for example) that are almost never used. You can use that as a baseline both for optimizing local applications and in your efforts to migrate to SaaS. And if you feel like you don't want to bother with all of this performance monitoring and optimization, you can always choose to use our services and we'll take care of that for you.
Okay, next question.
So, the next question: worker schedulers errors and engine performance. How to fix?
I think I would definitely point you back to this Log Analysis application. Load the time frame where you think something bad happened, and see what kind of insights you can get by playing with the data, by exploring the data. And then narrow that search down if you find a specific pattern that seems like the product is misbehaving. Talk to Qlik Support; we'll evaluate that with you and determine whether this is a defect or not, or if it's just a quirk of how your system is set up. But that Sense Log Analysis app is a great place to start. And going back to the sheet that I showed: Repository and Engine metrics are all collected there. And these come from the performance logs that we already produce from Qlik Sense. You don't need to load any additional performance counters to get those details.
Okay.
All right. So, there is a question here about Postgres 9.6 and the fact that it's soon coming to end of life. And I think this is a great moment to talk about this. Qlik Sense client-managed (or Qlik Sense Enterprise on Windows) supports Postgres 12.5 for new installations since the May release. If you have an existing installation, 9.6 will continue to be used; but there is an article on Community on how to in-place upgrade that to 12.5 as a standalone component. So, you don't have to continue using 9.6 if your IT policy is complaining about the fact that it's soon coming to the end of life. As we say, we are aware of this fact; and in fact, we are shipping a new version as of the May 2021 release.
Oh, great.
So, here's an interesting question: if we have Qlik Sense in Azure on a virtual machine, why is the performance so sluggish? How do you fine-tune it? I guess first we need to understand what you mean by sluggish. But the first thing that I want to point to is: different instance types. Virtual machines in virtual private cloud providers are optimized for different workloads, and the same is true for AWS, Azure and Google Cloud Platform. You will have virtual machines that are optimized for storage; ones that are optimized for compute tasks or application analytics; some that are optimized for memory. Make sure that you've chosen the right instance type and the right level of provisioned IOPS for this application. If you feel that your performance is sluggish, start increasing those resources. Go one tier up and re-evaluate until you find an instance type that works for you. If you wish to have these results beforehand (let's say), you will have to consider using the Scalability Tools together with some of your applications against different instance types in Azure to determine which ones work best.
Just to kind of follow up on that question, if we're looking at that multi-node example from Qlik help, what nodes would you consider would require more resources?
Worker nodes in general. And those would be front and back-end.
So, a worker node is something with an engine, right?
Exactly. Something with an engine. It can either be front-facing, together with a proxy, to serve content, or back-end, together with a scheduler service, to perform reload tasks. These will consume all the resources available on a given machine.
Okay.
And this is how the Qlik Sense engine is developed to work. And these resources are almost never released unless there is a reason for it, because us keeping those results cached is what makes the product fast.
Okay.
Oh, here's a great one about avoiding working set breaches on engine nodes. The question says: do you have any tips for avoiding the max memory threshold from the QIX engine? We didn't really cover this aspect, but as you know the engine allows you to configure both a lower and a higher memory limit. To understand how these work, I want to point you back to that QIX engine white paper. The system will perform certain actions when these thresholds are reached. The first prompt that I have for you in this situation is: understand whether these limits are far away from your physical memory limit. By default, Qlik Sense (I believe) uses 70 and 90 percent as the low and high working sets on a machine. With a lot of RAM, let's say 256 GB to half a terabyte, if you leave that low working set limit at 70 percent, that means that by default 30 percent of your physical RAM will not be used by Qlik Sense. So always keep in mind that these percentages are based on the physical amount of RAM available on the machine, and as soon as you deploy large machines (large meaning 128 GB and up) you have to redefine these parameters. Raise them so that you utilize almost all of the resources available on the machine, and you should be able to visualize that very easily in the Log Analysis app by going to the Engine Load sheet and inserting reference lines based on where your current working sets are. Of course, the only real way to avoid a working set limit issue is to make sure that you have enough resources and that the system is configured to utilize those resources. Raise the limits and allow the product to use as much RAM as it can without interfering with Windows operations, which is why you should never set these to something like 98 or 99: Windows needs RAM to operate by itself, and if we let Qlik Sense take all of it, it will break things. If you've done that and you're still having performance issues, that means you need more resources.
Yeah. It makes sense.
Oh, so here is another interesting question about understanding what certain Qlik Repository Service (QRS) log messages say. The question says: we try to meet the recommendation that network latency to the persistence layer should be less than 4 ms, but we consistently see "QRS security management retrieved privileges in X milliseconds" messages in our logs. Could this be a Repository Service issue, or where would you suggest we investigate first? This is an info-level message that you are reporting, and it's simply telling you how long it took for the Repository Service to compute the result for that request. That doesn't mean that this is how long it took to talk to the database and back, or how long it took for the request to travel from the client to the server; only how long it took for the Repository Service to look up the metadata, look up the security rules, and then return a result based on that. And I would say this coming back in 384 milliseconds is rather quick. It depends on how you've defined these security rules. If these security rules are super simple and you are still getting slow responses, we would definitely have to look at resource consumption. But if you want to know how these calls affect resource consumption on the Repository and Postgres side, go back to that Log Analysis app. Raise your Repository performance logs in the QMC to Debug level so that you get all of the performance information about how long each call took to execute, and try to establish some patterns. See if you have calls that take longer to execute than others, and where those are coming from: any specific apps, any specific users? All of these answers come from drilling down into the data via the app that I demoed.
Okay Mario, we have time for one last question.
Right. And I think this is an excellent one to end on. We talked a whole bunch here about Qlik Sense, but all of this also applies to QlikView environments. We are always looking at taking a step back and considering all of the resources at play in the ecosystem, not just the product itself. And the question asks: is QlikView Server performance, and the way it handles resources, similar to Qlik Sense? The answer is: yes. The engine is exactly the same in both products. If you read that white paper, you will understand how it works in both QlikView and Qlik Sense. And the things that you should do to prepare for performance and optimization are exactly the same in both products. Excellent question.
Great. Well, thank you very much Mario!
Oh, it's been my pleasure Troy. That was it for me today. Thank you all for participating. Thank you all for showing up. Thank you Troy for helping me through this very very complicated topic. It's been a blast as always. And to our customers and partners, looking forward to seeing your questions and deeper dives into logs and performance on community.
Okay, great! Thank you everyone! We hope you enjoyed this session. Thank you to Mario for presenting. We appreciate getting experts like Mario to share with us. Here's our legal disclaimer and thank you once again. Have a great rest of your day.
Studio is starting with the incorrect JDK even though it's specifically set in JAVA_HOME and the <Studio Home>\Talend-Studio-win-x86_64.ini file.
Edit the -vm option contained in the startup shortcut so that it points to the correct JDK.
If Studio is launched from a startup shortcut, the shortcut may contain a -vm option pointing to a different (old) JDK rather than the one already set in the ini file or the JAVA_HOME variable, as in the sketch below.
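For illustration only, with example paths that are not taken from this article, a shortcut target forcing Studio onto a specific JDK might look like this:
"C:\Talend\Studio\Talend-Studio-win-x86_64.exe" -vm "C:\Program Files\Java\jdk-17\bin\javaw.exe"
A -vm option passed on the command line typically takes precedence over the ini file, so either remove it so that the ini file and JAVA_HOME take effect, or point it to the JDK you actually want Studio to use.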
setting-up-java_home-for-windows
When using Talend Studio, you may encounter the following error while attempting to open a remote project or apply a patch update:
javax.net.ssl.SSLHandshakeException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
This error typically arises due to a missing or untrusted certificate in the Java keystore required to establish a secure connection. This article provides step-by-step instructions to resolve this issue in two common scenarios:
Verify Java Version:
Download and Install the Required CA Certificate:
Import the CA Certificate into Java Keystore:
Locate the cacerts file (typically at <JAVA_HOME>/jre/lib/security/cacerts).
Open a command prompt or terminal and navigate to the directory containing your cacerts file.
Use the keytool command to import the certificate:
keytool -import -alias <alias_name> -file <path_to_certificate_file> -keystore <path_to_your_jdk>/jre/lib/security/cacerts
Replace <alias_name> with a name for the imported certificate, <path_to_certificate_file> with the path to the downloaded CA certificate file, and <path_to_your_jdk> with the path to your JDK installation.
When prompted, enter the keystore password (default is changeit).
Verify the Import:
List the certificates in the cacerts file using:
keytool -list -v -keystore <path_to_your_jdk>/jre/lib/security/cacerts
Ensure the imported certificate appears in the list.
Verify Java Version:
Download Required CA Certificate:
Import the Certificate into Java Keystore:
Configure Talend Studio to Use the Updated Keystore:
Add the following JVM argument to your Talend Studio talend.ini file:
-Djavax.net.ssl.trustStore=<path_to_your_jdk>/jre/lib/security/cacerts
Replace <path_to_your_jdk> with the path to your JDK installation.
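For context, JVM arguments in an Eclipse-style ini file belong after the -vmargs line. A minimal sketch, assuming the default layout (the memory settings are placeholders for arguments that may already be present in your file):
-vmargs
-Xms512m
-Xmx4096m
-Djavax.net.ssl.trustStore=<path_to_your_jdk>/jre/lib/security/cacerts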
If you are operating behind a corporate proxy, you may need to import the proxy server's SSL certificate into your Java keystore. Similarly, any specific corporate firewall rules that could be blocking connections to Talend Cloud should be reviewed and configured to allow necessary traffic.
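A minimal sketch for the proxy scenario, assuming the openssl command-line tool is available (the host name and output file are placeholders, not values from this article): capture the certificate chain presented during the TLS handshake, then import the relevant certificate with keytool as described above.
# Save the certificate chain offered by the proxy or server to chain.pem
openssl s_client -connect <proxy_or_talend_cloud_host>:443 -showcerts </dev/null > chain.pem
Extract the proxy or CA certificate from chain.pem into its own file before importing it into the cacerts keystore.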
This article explains how SQL Server transaction log (T-log) cleanup works when Microsoft replication/publication is enabled on the database.
When a Qlik Replicate task runs for the first time to capture CDC, Qlik Replicate creates a publication on the database with the required articles. As part of this publication, a Log Reader Agent job is also created; this job runs continuously to mark replicated transactions on the database.
Apart from the Replicate process, a transaction log backup job typically runs every 15 or 30 minutes, depending on the source team's policy. This backup job backs up all transactions up to that point in time and truncates all replicated and committed transactions from the T-log.
Assume a scheduled T-log backup job is going to run at 10 AM while the Replicate task is reading the transaction log with 5 minutes of latency. There is a high possibility that the backup job will remove transactions from the transaction log that Qlik Replicate has not yet read. In this scenario, the Qlik Replicate task fails with a missing LSN error.
To prevent such issues, Qlik Replicate provides an option to hold the T-log for a few minutes without truncation, based on the setting below:
Qlik Replicate creates an internal table called attrep_truncation_safeguard on the source database and keeps two update queries running without a commit (two update queries for each Qlik Replicate task running on the database, referred to as latch locks A and B). This happens only when the Start transactions in the database setting is enabled on the source SQL endpoint. Qlik Replicate updates the time on these queries every 5 minutes by default; this interval can be controlled with the option "Apply TLOG truncation prevention policy every (seconds): ".
Here are the screenshots showing how to check these open transactions on the database; a command-line sketch follows as well.
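A minimal command-line sketch, assuming sqlcmd is available and using placeholder server and database names (not values from this article), that surfaces the same information:
# Show the oldest open (uncommitted) transaction in the source database
sqlcmd -S <server> -d <source_database> -Q "DBCC OPENTRAN"
# List active transactions together with the sessions and logins holding them
sqlcmd -S <server> -d <source_database> -Q "SELECT s.session_id, s.login_name, a.name, a.transaction_begin_time FROM sys.dm_tran_session_transactions st JOIN sys.dm_tran_active_transactions a ON st.transaction_id = a.transaction_id JOIN sys.dm_exec_sessions s ON st.session_id = s.session_id"
The transactions held open by the truncation safeguard appear as long-running open transactions owned by the Qlik Replicate login.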
The Eclipse OSGi framework was unable to promptly identify the installed CommandLine module.
To resolve this issue, configure a retry so that the module can be detected.
Add -Dinstall.org.eclipse.equinox.p2.transport.ecf.retry=10 to the mvn command, for example:
mvn org.talend.ci:builder-maven-plugin:8.0.16:generateAllPoms \
-Dtalend.studio.p2.update='/opt/Patch_20240621_R2024-06_v1-8.0.1' \
-Dlicense.path='/opt/license' \
-Dinstaller.clean=true \
-Dstudio.error.on.component.missing=false \
-Dinstall.org.eclipse.equinox.p2.transport.ecf.retry=10 \
-s ${TALEND_SETTINGS_XML}
Talend Management Console (TMC) supports Single Sign-On (SSO) and integrates with several SSO platforms. In this exercise, you will activate SSO by linking your TMC with Okta, a third-party, enterprise-grade identity management service built for the cloud but compatible with many on-premises applications.
Check your email inbox and click the confirmation link to activate the Okta account.
Connect to your Okta organization and add Talend Cloud as a new SSO-enabled application.
The TalendCloudDomainName attribute indicates your Talend Cloud domain. You can find the domain name in the Domain field of the Subscription page of your Talend Management Console. The NameId Format attribute indicates the email address format.
Once you configure and create the Talend Cloud SAML application, you can see the icon created in Okta My Apps.
And TADA! You are connected to Talend. Check that your user has the roles and types you have set.
You must have the Security Administrator role in Talend Management Console and have the metadata file obtained from the SSO provider
Free Trial OKTA Account
Talend Management Console
creating-talend-cloud-application-in-okta
When a customer tries to connect to a MySQL 8 database on Talend 8 R2024-05, the following error is raised:
===============
javax.net.ssl.SSLException: closing inbound before receiving peer's close_notify
at java.base/sun.security.ssl.SSLSocketImpl.shutdownInput(SSLSocketImpl.java:842)
at java.base/sun.security.ssl.SSLSocketImpl.shutdownInput(SSLSocketImpl.java:821)
===============
The customer is using the MySQL driver mysql-connector-java-8.0.12.jar.
This appears to be a bug in the MySQL Connector/J driver; driver versions earlier than 8.0.16 appear to be affected, and the fix is documented in the 8.0.16 release notes linked below.
https://bugs.mysql.com/bug.php?id=93590
https://dev.mysql.com/doc/relnotes/connector-j/8.0/en/news-8-0-16.html
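As a hedged sketch (the jar name is only an example and the commands assume a Linux or macOS shell), you can confirm which Connector/J version a jar actually contains before replacing it with 8.0.16 or later:
# Print the version information recorded in the driver's manifest
unzip -p mysql-connector-java-8.0.12.jar META-INF/MANIFEST.MF | grep -i version
If the reported version is older than 8.0.16, swap the jar referenced by the Job or connection metadata for a newer Connector/J release.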
This article supplements documentation that requires changes to the Qlik Sense Engine Settings.ini. No settings are provided in this article.
[Settings 7]
Key=Value
If you are looking to modify the Qlik Sense Desktop client settings.ini:
Qlik Enterprise Manager (QEM) fails to monitor Qlik Replicate.
The following issues can be observed:
Possible error messages:
This is a known issue and Qlik is actively working on a patch. Please review the release notes for QB-26321 and QB-27571 for updates.
QB-26321 and QB-27571