Skip to main content
Announcements
Global Transformation Awards submissions are open! SUBMIT YOUR STORY
cancel
Showing results for 
Search instead for 
Did you mean: 
BillZrv
Contributor II

Two parallel tasks in QMC run forever

Hello everyone,

We were running two tasks one after the other, and what they do is taking tables from SSMS and create qvd files. This takes around 8 min for each task if run in line.

We wanted to run those tasks in parallel to save time, but after doing so those tasks need at least 30 minutes. Sometimes do work in parallel (rare occasion though), and some times we occur the "false" running situation that I have read some members wrote here.

I have checked our recourses, and while those tasks were running, we were using just the 11% of our RAM and 5-10% of our CPU.  I have noticed that the download speed is steady on 100Mbps, while if those two tasks are running in line, the download speed can reach at least at 300Mbps.

I have also checked that we set up to 5 tasks to run at the same time and yesterday we rebooted the QlikView Server. 

Any ideas? 

Thank you so much!

 

 

Labels (3)
4 Solutions

Accepted Solutions
marcus_sommer

From the Qlik-side tables with these sizes are not a problem - especially by your mentioned QlikView environment and the observed CPU/RAM consumption. But by the SQL server I'm not so sure.

Important is to know that QlikView (like the most other tools) no SQL executes else it just transferred the statement per driver to the databases and received on this way back the results. This means everything happens within SQL server environment - and a query on a raw-table of 17 GB with some transformations and some caching/buffering may in the end easily consume 50 GB - depending on settings maybe completely in RAM. Is this amount always available?

Beside this I suspect more your network as limitation because your sql message might just mean that the database mustn't buffer any data if the network is (too) slow. Nowadays nearly everything is virtualized and therefore I suggest a monitoring of the VLAN + (V-)proxies + load-balancer.

Another suggestion is to implement (more) incremental approaches and/or diving both tasks into smaller chunks. 

- Marcus

View solution in original post

Dan-Kenobi
Partner - Contributor III

I agree with @marcus_sommer in the sense it's unlikely to be on the QlikView side. 

One thing I overlooked (because we rarely get this deep into a problem) was the NUMA settings. Not sure if you can disable it.

And, since we are at the hardware level now, I wouldnt be against you standing up another QDS Vm and transition the jobs there to see if it makes any difference.

Another thing that shouldn't be affecting you is the heap setting o  the registry. But since we're kinda running our of ideas, here you go:

https://community.qlik.com/t5/QlikView-Administration/QlikView-Server-Memory-Heap/td-p/933720

 

If you so the change, let us know how it affected your environment. 

I feel like I'm one answer away from offering a you a Teams support call, and then redirecting you to your Qlik Rep for further assistance if we can't figure it out ourselves.

View solution in original post

marcus_sommer

Were really all network-parts be checked - means also the proxies and any load-balancers? About two years ago we had had a case with accessing external data with an irregular performance and often also timeouts and the IT swore that nothing has been changed and the errors must be on our side.

After a while we got them ready to a few live monitoring-sessions with a running WireShark to monitor each detail of the traffic and the cause in the end was a new virtualized load-balancer cluster which distributes the requests to the proxy-clusters - and not all of them were properly configured. Maybe that is also a possible way for your issue.

- Marcus

View solution in original post

Dan-Kenobi
Partner - Contributor III

@marcus_sommer - that's a great suggestion too. My issue with going to the Networking side is that it's quite difficult to provide evidence to the Networking team that the ball is indeed on their court.

But, @BillZrv , if you are allowed to have WireShark on your server, and assuming it's not going to interfere too much with your tasks' resources , then this might route's probably better than my previous suggestion (since mine required the setup of a whole other box - and the problem might be not related to Qlik or the computer it sits on).

View solution in original post

15 Replies
marcus_sommer

Running tasks in parallel must not mandatory result in a smaller run-time even if it looked that there are plenty of available resources. In your case I could imagine that the SSMS caused the delay because the parallel accesses are queued for any reason (performance, priorities, locking ...).

- Marcus

Dan-Kenobi
Partner - Contributor III

Hi @BillZrv ,

        I'm not sure what queries you are running on the background, but it does seem to me you're running into some sort of resource contention (tables deadl-locking, or multiple tasks trying to write the same file at the same times, etc).

       My suggestion, if not done so yet, would be to ensure your SSIS packages are not writing anywhere; that you're not using temp global tables at the same time, that you're not locking tables when reading (usage of "no lock" on the query) and ensuring the QVDs each task works with are different.

        Please let me us know if contention was accounted for, and we can move on to other things.

Thank you

BillZrv
Contributor II
Author

Hello Dan!

thank for the support!

I have checked that the actions we run in parallel are NOT connected to each other, or trying to extract data from the same tables on the SQL server.

I have tried to run them in line, and still, some times, experience the same problem.  The truth is that we get this slow fetching data problem completely at random, either running the tasks in parallel or in line. 

We rebooted the QlikView server two or three times to see if it might fix it, but unfortunately no. In the past, multiple tasks were failing for various reasons, so I am not sure if this error is new, or we just figured out now since we fixed the other ones and made things more clear. 

Till now, the only clue we have remains the same, when those two tasks are running (regardless in line or parallel) and we see on the task manager of the QlikView server a steady download speed of 100Mbps, is 100% sure that those tasks will need triple the time in order to complete (when the speed is 300 or higher then we are good to go).

marcus_sommer

So it looked that not QlikView and not the SQL server are responsible for the delaying. This would mean that anything within your environment may causing the issue, like the (virtual) machines or the (virtual) networks and/or the belonging load balancer aren't working stable. Monitoring their behaviour or looking in the various log-files may give hints what doesn't work like expected.

- Marcus  

BillZrv
Contributor II
Author

I checked now on to SSMS and we see:

Async_Network_io suspended.

If anyone has a clue at what is going on, or what else should I check, please let me know!

marcus_sommer

What's behind the feature is described here: ASYNC_NETWORK_IO - SQL Shack - articles about database auditing, server performance, data recovery, .... I don't know why it might be suspended - maybe because any thresholds in regard to storage/RAM/CPU are exceeded ...

- Marcus

BillZrv
Contributor II
Author

As far as I understand, the size of the tables that I am trying to load from the SQL server might be the problem.

The first task is retrieving data from a table which size is around 17GB and create a qvd with a size of 2.8GB.

The second task that I set to run in parallel, retrieves multiple tables and views whose size are approximately 5-6GB, and they create multiple qvds that their combined size is 1,25GB.

If this is the problem, it creates a lot of other questions such as:

why sometimes is working properly?

why there is no difference running it in parallel or in line?

I know those questions might not be answered, I am just leaving them here in case someone finds a better solution for them in the future.  

Thank you Marcus!

BillZrv
Contributor II
Author

Hello,

My problem seems to look alike this one:

https://community.qlik.com/t5/QlikView-Administration/Tasks-running-slow-at-random-times/td-p/860104

Our System Settings:

QlikView 12.60.20200.0

16x Intel Xeon CPU E5-2630 v3 @2.40GHz

256 GB RAM

Windows Server 2012 R2

Again, if anyone has an idea of what might be the problem, would mean a lot!

Thank you!

marcus_sommer

I don't think that there is nowadays a major-problem with the distribution of threads between the cores like in the old days of the first opteron server - nevertheless you might investigate which BIOS settings are there (things like hyper-threading, NUMA and similar). Especially if it's running within a virtual machine - how is the host configured?
 
Further I doubt that the issue is related to QlikView and also that it's less likely that the server-resources from the QlikView machine are the cause.
 
- Marcus