We have QS Enterprise June 2019 installed in our production environment. The environment consists of 1 standalone PostgreSQL server, 1 central node, 1 proxy node, 2 consumer nodes, and 2 reload nodes.
This morning at 1:11:38, PostgreSQL shut itself down (only the postgres service, not the server it runs on). At 1:12:30, postgres was back up.
But at 1:11:39, the central node sent a request to postgres and got this error:

"Unable to read data from the transport connection: An existing connection was forcibly closed by the remote host. ---> System.IO.IOException: Unable to read data from the transport connection: An existing connection was forcibly closed by the remote host. ---> System.Net.Sockets.SocketException: An existing connection was forcibly closed by the remote host
   at System.Net.Sockets.NetworkStream.Read(Byte buffer, Int32 offset, Int32 size)
   --- End of inner exception stack trace ---
   at Devart.Common.aa.a(Byte A_0, Int32 A_1, Int32 A_2)
   at Devart.Common.g.c(Byte A_0, Int32 A_1, Int32 A_2)
   at Devart.Common.o.d(Byte A_0, Int32 A_1, Int32 A_2)
   at Devart.Common.al.d(Exception A_0)
   at Devart.Common.o.d(Byte A_0, Int32 A_1, Int32 A_2)
   at Devart.Data.PostgreSql.v.a(Boolean A_0, Boolean A_1, Char A_2, Boolean A_3)
   at Devart.Data.PostgreSql.f.a()' 50d8c47f-e291-4ddc-b61e-5c71acf82864"

The central node tried 7 times in total. The last attempt was at 1:20:20.
After receiving the error message "No connection could be made because the target machine actively refused it" at 1:20:20, it decided to terminate the background work executor. The central node logged "terminating background work executor" and "Executor stopped" roughly every half hour from 1:12:36 am until 6:55:02 am. As a result of the termination, the central node could not read extension information from \\shared drive\StaticContent\Extensions\ and complained about "Invalid extension schema".
At 8:14 am we found that no charts rendered in mashups, apps could not be opened from the hub, and the QMC showed only a black background with a large spinning circle.
At 9:05 am, everything mysteriously started working again. In the system_repository log I found that System.Repository.Repository.Core.Installer.InstallerConfigurationHandler wrote "Installer configuration successfully read". After that there were no more complaints about "Invalid extension schema".
My question is: why did the central node stop trying to contact the postgres server after 1:20:20?
I also need advice on how to find out what fixed the issue at 9:05 am. The only thing I did was try to access the QMC through the proxy services on different nodes. We have a proxy service on the central node, on the proxy node, and on 1 consumer node. Colleagues said they did nothing except reboot the QS shared drive at around 8:46.
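
To narrow down what changed around 9:05, one option would be to pull every log line between roughly 8:40 and 9:10 from the log folders of all nodes and read them side by side. Below is a minimal Python sketch of that idea. It is my own helper, not anything shipped with Qlik: the function names, the `*.txt` file pattern, and the two timestamp layouts it recognises (`HH:MM:SS` and a compact `YYYYMMDDTHHMMSS`) are assumptions that would need adjusting to the real log files. It also ignores the date, so the input should be limited to logs from the day in question.

```python
import re
import sys
from datetime import time
from pathlib import Path

# Assumption: a log line carries its time of day either as a compact
# YYYYMMDDTHHMMSS stamp or as HH:MM:SS. Adjust to the real log layout.
TIME_RE = re.compile(
    r"\d{8}T(\d{2})(\d{2})(\d{2})"
    r"|(?<!\d)(\d{1,2}):(\d{2}):(\d{2})(?!\d)"
)


def line_time(line):
    """Return the first time of day found in the line, or None."""
    m = TIME_RE.search(line)
    if not m:
        return None
    # Exactly one alternative matched, so three groups are non-empty.
    h, mi, s = (int(g) for g in m.groups() if g is not None)
    if h > 23 or mi > 59 or s > 59:
        return None
    return time(h, mi, s)


def lines_in_window(lines, start, end):
    """Yield the lines whose time of day lies within [start, end]."""
    for line in lines:
        t = line_time(line)
        if t is not None and start <= t <= end:
            yield line.rstrip("\n")


def scan_logs(root, start, end, pattern="*.txt"):
    """Yield (path, line) for matching lines in all files under root."""
    for path in sorted(Path(root).rglob(pattern)):
        with open(path, encoding="utf-8", errors="replace") as fh:
            for line in lines_in_window(fh, start, end):
                yield path, line


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python scan_logs.py <folder containing the copied logs>
    for path, line in scan_logs(sys.argv[1], time(8, 40), time(9, 10)):
        print(f"{path.name}: {line}")
```

Running this against a copy of each node's log folder would give one list per node for the same window, which should make it easier to see whether the shared drive reboot at around 8:46 or something else preceded the recovery.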