We were running two tasks one after the other; each task pulls tables from SQL Server (via SSMS) and creates QVD files. Each task takes around 8 minutes when they run sequentially.
We wanted to run those tasks in parallel to save time, but after doing so the tasks need at least 30 minutes. Sometimes they do work fine in parallel (a rare occasion though), and sometimes we run into the "false" running situation that I have read some members describe here.
I have checked our resources, and while those tasks were running we were using just 11% of our RAM and 5-10% of our CPU. I have noticed that the download speed stays steady at 100 Mbps, while if those two tasks run sequentially the download speed can reach at least 300 Mbps.
I have also checked that we are set up to run up to 5 tasks at the same time, and yesterday we rebooted the QlikView Server.
Thank you so much!
From the Qlik side, tables of these sizes are not a problem - especially given the QlikView environment you mention and the observed CPU/RAM consumption. But about the SQL Server I'm not so sure.
It is important to know that QlikView (like most other tools) does not execute SQL itself; it just passes the statement through the driver to the database and receives the results back the same way. This means everything happens within the SQL Server environment - and a query on a 17 GB raw table with some transformations and some caching/buffering may in the end easily consume 50 GB, depending on the settings maybe completely in RAM. Is this amount always available?
Beside this, I suspect the network is more likely the limitation, because your SQL message might just mean that the database cannot hand over the data fast enough if the network is (too) slow. Nowadays nearly everything is virtualized, and therefore I suggest monitoring the VLAN + (v-)proxies + load balancer.
Another suggestion is to implement (more) incremental approaches and/or to divide both tasks into smaller chunks.
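The incremental idea could look something like the load-script sketch below - all table, field and file names (BigTable, PrimaryKey, ModifiedDate, the QVD path) are placeholders I made up for illustration, not taken from this thread:

```
// Sketch of an incremental load: pull only new/changed rows from SQL Server,
// then merge them with the history already stored in the QVD.

// vLastExecTime would be maintained by the script itself, e.g. stored in a
// small variable-QVD or taken from the previous successful run's start time.
Data:
SQL SELECT *
FROM dbo.BigTable
WHERE ModifiedDate >= '$(vLastExecTime)';

// Append the historical rows that were not reloaded above:
Concatenate (Data)
LOAD * FROM C:\QVDs\BigTable.qvd (qvd)
WHERE NOT Exists(PrimaryKey);

STORE Data INTO C:\QVDs\BigTable.qvd (qvd);
```

This way each run only transfers the delta over the network instead of the full 17 GB, which should also reduce the time the SQL side spends waiting on ASYNC_NETWORK_IO.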
I agree with @marcus_sommer in the sense that it's unlikely to be on the QlikView side.
One thing I overlooked (because we rarely get this deep into a problem) was the NUMA settings. I'm not sure if you can disable them.
And, since we are at the hardware level now, I wouldn't be against you standing up another QDS VM and transitioning the jobs there to see if it makes any difference.
Another thing that shouldn't be affecting you is the heap setting in the registry. But since we're kind of running out of ideas, here you go:
If you do make the change, let us know how it affected your environment.
I feel like I'm one answer away from offering you a Teams support call, and then redirecting you to your Qlik Rep for further assistance if we can't figure it out ourselves.
Were all the network parts really checked - meaning also the proxies and any load balancers? About two years ago we had a case of accessing external data with irregular performance and often also timeouts, and the IT swore that nothing had been changed and the errors must be on our side.
After a while we got them to agree to a few live monitoring sessions with Wireshark running to capture each detail of the traffic, and the cause in the end was a new virtualized load-balancer cluster which distributes the requests to the proxy clusters - and not all of them were properly configured. Maybe that is also a possible avenue for your issue.
@marcus_sommer - that's a great suggestion too. My issue with going down the networking route is that it's quite difficult to provide evidence to the networking team that the ball is indeed in their court.
But, @BillZrv, if you are allowed to have Wireshark on your server, and assuming it's not going to interfere too much with your tasks' resources, then this route is probably better than my previous suggestion (since mine required setting up a whole other box - and the problem might not be related to Qlik or the machine it sits on).
Running tasks in parallel does not necessarily result in a shorter run time, even if it looks like there are plenty of available resources. In your case I could imagine that the SQL Server caused the delay because the parallel accesses are queued for some reason (performance, priorities, locking ...).
Hi @BillZrv ,
I'm not sure what queries you are running in the background, but it does seem to me you're running into some sort of resource contention (tables deadlocking, multiple tasks trying to write the same file at the same time, etc.).
My suggestion, if not done so already, would be to ensure your SSIS packages are not writing anywhere; that you're not using global temp tables at the same time; that you're not locking tables when reading (use of "NOLOCK" on the query); and that the QVDs each task works with are different.
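As a rough illustration of those last two points (the table and file names are made up, not from your environment), each task's extract could read with NOLOCK and store into its own QVD:

```
// Read without taking shared locks on the source table.
// Note: NOLOCK allows dirty reads, so only use it where that is acceptable.
Orders:
SQL SELECT * FROM dbo.Orders WITH (NOLOCK);

// Each task writes to its own, uniquely named QVD so two parallel
// tasks never fight over the same file:
STORE Orders INTO C:\QVDs\Task1_Orders.qvd (qvd);
```

If two tasks ever STORE to the same path, the second writer can block or fail, which would match the random behaviour you describe.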
Please let us know if contention was accounted for, and we can move on to other things.
Thank you for the support!
I have checked that the tasks we run in parallel are NOT connected to each other and are not trying to extract data from the same tables on the SQL Server.
I have tried to run them sequentially and still, sometimes, experience the same problem. The truth is that we get this slow data-fetching problem completely at random, whether running the tasks in parallel or sequentially.
We rebooted the QlikView server two or three times to see if that might fix it, but unfortunately no. In the past, multiple tasks were failing for various reasons, so I am not sure if this error is new or we just noticed it now, since we fixed the other ones and things became clearer.
Until now, the only clue we have remains the same: when those two tasks are running (regardless whether sequentially or in parallel) and we see in the Task Manager of the QlikView server a steady download speed of 100 Mbps, it is 100% certain that those tasks will need triple the time to complete (when the speed is 300 Mbps or higher, we are good to go).
So it looks like neither QlikView nor the SQL Server is responsible for the delay. This would mean that something within your environment may be causing the issue - for example, the (virtual) machines or the (virtual) networks and/or the associated load balancer aren't working stably. Monitoring their behaviour or looking into the various log files may give hints about what isn't working as expected.
I have now checked in SSMS and we see:
If anyone has a clue about what is going on, or what else I should check, please let me know!
What's behind the feature is described here: ASYNC_NETWORK_IO - SQL Shack - articles about database auditing, server performance, data recovery, .... I don't know why it might be suspended - maybe because some thresholds in regard to storage/RAM/CPU are exceeded ...
As far as I understand, the size of the tables that I am trying to load from the SQL server might be the problem.
The first task retrieves data from a table whose size is around 17 GB and creates a QVD with a size of 2.8 GB.
The second task, which I set to run in parallel, retrieves multiple tables and views whose sizes are approximately 5-6 GB, and creates multiple QVDs whose combined size is 1.25 GB.
If this is the problem, it creates a lot of other questions such as:
Why does it sometimes work properly?
Why is there no difference between running them in parallel or sequentially?
I know those questions might not be answerable; I am just leaving them here in case someone finds a better solution for them in the future.
Thank you Marcus!
My problem seems to look like this one:
Our System Settings:
16x Intel Xeon CPU E5-2630 v3 @2.40GHz
256 GB RAM
Windows Server 2012 R2
Again, if anyone has an idea of what the problem might be, it would mean a lot!