Token disposed error in Qlik .NET SDK
Hello everyone
I'm using the Qlik Sense .NET SDK to interact with a Qlik SaaS app and, while doing it, I sometimes get "CancellationTokenSource has been disposed" errors like this:
System.ObjectDisposedException: The CancellationTokenSource has been disposed.
at System.Threading.CancellationTokenSource.ThrowObjectDisposedException()
at System.Threading.CancellationTokenSource.get_Token()
at Qlik.Sense.JsonRpc.RpcConnection.get_CancellationToken()
at Qlik.Engine.Communication.QlikConnection.Qlik.Engine.Communication.IQlikConnection.get_CancellationToken()
at Qlik.Engine.App.BackCount()
In this case I was invoking the IApp.BackCount() method, but this can happen even when calling other methods.
Does anyone know what this error means exactly? Is it a disconnection? Why is the CancellationTokenSource getting randomly disposed in the middle of execution, and how can I diagnose the cause of this?
I'm not sure why you end up with that "CancellationTokenSource disposed" error, but I would guess it's due to the connection to QCS being dropped for some reason. That tends to happen more often with SaaS solutions than with on-prem solutions. Unfortunately, the .NET SDK does not currently provide a good way to handle this transparently and reestablish the connection under the hood.
What you could do is observe the "ConnectionClosing" event in the session. Like this:
app.Session.ConnectionClosingEvent += SessionOnConnectionClosingEvent;
That could at least give you a fair idea if this is really what is happening. You might also activate logging for the SDK and see if you see any traces of what is happening there.
Hello Øystein_Kolsrud, thanks for your support.
Indeed, I have noticed relatively frequent timeouts/disconnections when using the SDK with QS SaaS. In addition to the error above, I also sometimes get this one when connecting to an app on QCS:
System.TimeoutException: Method "QTProduct" timed out
at Qlik.Engine.Communication.QlikConnection.AwaitResponseTask[T](T task, String methodName, CancellationToken cancellationToken)
at Qlik.Engine.Communication.QlikConnection.AwaitResponse(Task task, String methodName, CancellationToken cancellationToken)
at Qlik.Engine.Communication.QlikConnection.Ping()
at Qlik.Sense.JsonRpc.GenericLocation.DisposeOnError(IDisposable o, Action f)
at Qlik.Engine.QcsLocation.Hub(String appId, String sessionToken, Dictionary`2 wsParameters)
at Qlik.Engine.QcsLocation.App(String appId, SessionToken sessionToken, Boolean noData)
Can I ask if this is something that is known to the dev team and is being worked on? And are there any steps we can adopt on our end to mitigate the issue?
Unfortunately, I believe these problems are due to transient issues with the network connections to the servers hosted in the cloud, so they are not something that can easily be remedied. The only solution I can come up with is to implement retry mechanisms for whatever operations you are doing. You could of course play around with extending timeouts and such, but that is unlikely to really solve your problem.
Making the SDK more robust in this area has certainly been considered. It's not a trivial thing to do though and would require a couple of API changes that would break backwards compatibility so we have been hesitant to do that.
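As a rough illustration of the retry idea mentioned above, here is a minimal sketch of a generic retry helper with exponential backoff. `TransientRetry`, `IsTransient` and `Execute` are hypothetical names, not part of the Qlik SDK, and treating exactly these three exception types as transient is an assumption based on the stack traces posted in this thread.

```csharp
using System;
using System.Net.WebSockets;
using System.Threading;

// Hypothetical helper, not part of the Qlik SDK.
public static class TransientRetry
{
    // Assumption: these are the exception types that indicate a dropped
    // connection, taken from the stack traces posted in this thread.
    public static bool IsTransient(Exception e) =>
        e is TimeoutException || e is ObjectDisposedException || e is WebSocketException;

    // Runs 'operation' and retries on transient failures, waiting
    // baseDelayMs, 2 * baseDelayMs, 4 * baseDelayMs, ... between attempts.
    // The last failure is rethrown once maxAttempts is reached.
    public static T Execute<T>(Func<T> operation, int maxAttempts = 4, int baseDelayMs = 500)
    {
        for (var attempt = 1; ; attempt++)
        {
            try
            {
                return operation();
            }
            catch (Exception e) when (IsTransient(e) && attempt < maxAttempts)
            {
                Thread.Sleep(baseDelayMs * (1 << (attempt - 1)));
            }
        }
    }
}
```

Note that retrying a call on an app object whose connection has already been closed will most likely just fail again, so the delegate passed to `Execute` would normally need to include opening the app as well as the call itself.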
As a developer I understand completely... as a Qlik Partner however, I also have to justify to my customers why our applications that interact with QS are unreliable, which in turn means telling them that Qlik Cloud is unreliable... as you can imagine, this is not a very good look.
Does Qlik provide any official statements and/or metrics on the reliability of APIs and the QCS itself?
Here's another error that I sometimes get randomly, again a transient disconnection:
System.Net.WebSockets.WebSocketException (0x80004005): The remote party closed the WebSocket connection without completing the close handshake.
at Qlik.Engine.Communication.QlikConnection.AwaitResponseTask[T](T task, String methodName, CancellationToken cancellationToken)
at Qlik.Engine.Communication.QlikConnection.AwaitResponse[T](Task`1 task, String methodName, CancellationToken cancellationToken)
at Qlik.Engine.App.Evaluate(String expression)
It really makes working with the Qlik Cloud API almost impossible, especially since it can happen when invoking any method that makes a request towards the Qlik engine (in this case it was the Evaluate() method).
A retry mechanism is fine if the error happens when you first connect, but it can become quite complicated if (as in this case) it happens in the middle of a session. I assume, for example, that the app session gets destroyed when this happens and thus any selections or soft patches that were previously applied are lost... which makes a lot of cases really difficult to handle.
It would be great to have an official response from Qlik on their policies for this kind of issue. I think I'll open a support case in this regard.
Just a follow-up on this one: The websocket disconnects are not necessarily due to errors in the connections. It is of course possible that a connection is terminated due to loss of network connection or a node crashing due to out-of-memory conditions, but the typical scenario is that a connection is terminated due to node scaling activity in the cloud solution. This is an artifact of the dynamic nature of cloud-based solutions and makes it possible for the system to load balance activity across nodes. An example is a case where a node is running an app that is quite large and that consumes a lot of computation cache memory. Another app is on the same node and competing for the same resources. The system can in such cases choose to move an app from one node to another, preferably one with more available memory. There are mechanisms in place that allow sessions to be transferred between nodes, and if you're working in the Qlik Sense client, then you hardly notice that this takes place. You might get a message saying that you are reconnecting, but once that has completed you can continue working as if nothing happened.
This mechanism does complicate things when you are working through the APIs though, because your websocket will be closed and a new one needs to be opened. But if the session has been moved to a different node, then you will get a "SESSION_ATTACHED" message when connecting to the app, and all your selections and everything will still be in place. So it's not correct to assume that the app session has been destroyed. It could very well have been moved to a different node.
However, the big problem from an API point of view is that there is state associated with the connection, not just the session. In particular, when you call an endpoint you need to identify the object for which you are performing an operation by using its handle. This handle is part of the connection state; you get it for instance when you call "App.GetGenericObject". So when you reconnect, even to an existing session, you need to get new handles for all objects you are working with.
Another case where sessions get moved to different nodes is when a new engine version is being deployed across the system. The system will then empty engines of sessions before shutting them down and deploying the new version. Again, you typically don't notice this when working in the client, because all you get is a short "reconnecting" message, but it's up to the client to decide how to handle such reconnects. For the web client, there's always the "trivial" fallback solution to just refresh the page. That might not be so simple in your API-based solution, but that all depends on what you're doing through the APIs.
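To make the point about handles concrete, here is a minimal sketch of a wrapper that throws away both the connection and everything obtained through it when a call fails, then reconnects and re-acquires before retrying. `Reconnecting`, `connect` and `acquire` are hypothetical names, not SDK APIs. In SDK terms, `connect` would presumably open the app and `acquire` would call something like "App.GetGenericObject" again, but that mapping is an assumption and the sketch itself does not touch the SDK.

```csharp
using System;
using System.Net.WebSockets;

// Hypothetical helper, not part of the Qlik SDK. It owns a connection and an
// object obtained through that connection, since the object's handle is only
// valid for the connection it was acquired on.
public sealed class Reconnecting<TConn, TObj> where TConn : class, IDisposable
{
    private readonly Func<TConn> _connect;
    private readonly Func<TConn, TObj> _acquire;
    private TConn _conn;
    private TObj _obj;

    public Reconnecting(Func<TConn> connect, Func<TConn, TObj> acquire)
    {
        _connect = connect;
        _acquire = acquire;
    }

    // Assumption: these exception types signal a lost connection.
    private static bool IsTransient(Exception e) =>
        e is TimeoutException || e is ObjectDisposedException || e is WebSocketException;

    public TResult Run<TResult>(Func<TObj, TResult> operation, int maxAttempts = 3)
    {
        for (var attempt = 1; ; attempt++)
        {
            try
            {
                if (_conn == null)
                {
                    // (Re)connect and fetch a fresh object; old handles are stale.
                    _conn = _connect();
                    _obj = _acquire(_conn);
                }
                return operation(_obj);
            }
            catch (Exception e) when (IsTransient(e) && attempt < maxAttempts)
            {
                // Drop the stale connection so the next attempt starts clean.
                try { _conn?.Dispose(); } catch { /* already broken */ }
                _conn = null;
            }
        }
    }
}
```

Whether selections and other session state survive the reconnect depends on whether the session was moved rather than destroyed, as described above; the sketch only deals with the connection-level state.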
Another thing: Your session will not be moved if you are using API keys to log in. You'll typically need to use an OAuth connection to benefit from this feature. If you're interested in exploring this, then it might be of use to you to know that I recently created this library that makes it possible to connect to QCS using OAuth:
https://www.nuget.org/packages/Qlik.OAuthManager
The ReadMe file of the project describes how it can be used:
https://github.com/kolsrud/qlik-oauth-manager/blob/main/README.md
There are also a couple of examples available here:
https://github.com/kolsrud/qlik-oauth-manager/tree/main/src/Qlik.OAuthManager/Examples
Thanks @Øystein_Kolsrud for the technical clarification.
My app uses a dedicated JWT IdP provider to authenticate using a certificate. Does this have implications when it comes to scaling/switching?
But if the session has been moved to a different node, then you will get a "SESSION_ATTACHED" message
Interesting... are there any examples to show how this should be handled when working with the SDK?
I'm not really an expert on this, so I'm not 100% sure, but I have only been able to get such session reconnects after sessions have been moved when I use OAuth. I've not seen that behavior when using JWT authentication.