That sounds quite strange. The server becomes unresponsive at about 70% RAM consumption, although the High setting of the working set is at 80% (I think you could raise this value, maybe to 95%, depending on your amount of RAM).
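To make those percentages more tangible, here is a minimal sketch that converts them into absolute values. The 256 GB total is purely a hypothetical figure (I don't know your actual RAM), and the function name is just mine for illustration; put in your own numbers.

```python
def threshold_gb(total_ram_gb, percent):
    """Convert a percentage of physical RAM into an absolute value in GB."""
    if total_ram_gb <= 0:
        raise ValueError("total_ram_gb must be positive")
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return total_ram_gb * percent / 100


if __name__ == "__main__":
    total_ram_gb = 256  # hypothetical value, replace with your server's RAM

    # 70 = observed freeze point, 80 = current High setting, 95 = suggested High
    for label, pct in [("observed freeze", 70),
                       ("current High", 80),
                       ("suggested High", 95)]:
        limit = threshold_gb(total_ram_gb, pct)
        headroom = total_ram_gb - limit
        print(f"{label:16s} {pct:3d}% -> {limit:6.1f} GB used, "
              f"{headroom:5.1f} GB left for OS and other services")
```

The point of the comparison is mainly the last column: the higher the High setting, the less RAM remains for the operating system and any other services on the same machine, so whether 95% is reasonable really depends on the total amount.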
Because the freeze happens so quickly, it should be possible to get hints by watching it directly in Task Manager and the QMC. Besides that and your efforts with the Governance Dashboard, enabling the extensive audit logging might give further ideas about what is happening.
The rapid RAM consumption after a service restart could be caused by preloading of the applications, and if your size figures come from compressed QVWs, the real RAM footprint could be a lot bigger.
Nevertheless, and although I have no experience of my own with cluster installations, I think there is a misconfiguration somewhere between the various services (do the QMC, the directory services and the web server(s) also have their own machines?) and in the load balancing between the nodes.