We have spent over two weeks trying to configure a cluster using QV10 SR2 (yes, I know SR3 is now available). We have two cluster nodes, with the shared root folder located on a Windows network share. This Windows server uses a replicated SAN as its backing store. The cluster nodes are geographically distributed (for disaster recovery purposes).
This configuration has been very unstable, with the QVS services restarting randomly and QDS (publisher) failing to reload.
The most prominent QVS errors have been "PGO: Failed to open" errors and "Warning File: Shared File Disappeared: \\<some_server>\<some_file>.qvw.Meta.". The latter error is the most disruptive, as it is always followed by a server restart with this error: "Error Restart: Server aborted trying to recover by restart. Reason for restart: Internal inconsistency, general exception detected."
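For anyone trying to compare how often these errors occur before and after a configuration change, here is a minimal Python sketch that counts them in a log file. It matches only on the message fragments quoted above; the function names and the idea of feeding it a QVS event log as plain text are my assumptions, and nothing here depends on a particular log layout.

```python
from collections import Counter

# Substrings copied from the error messages quoted in this post.
# The labels on the left are hypothetical names chosen for illustration.
PATTERNS = {
    "pgo_open_failed": "PGO: Failed to open",
    "shared_file_disappeared": "Shared File Disappeared",
    "server_restart": "Server aborted trying to recover by restart",
}


def tally_errors(lines):
    """Count how many lines contain each known error fragment."""
    counts = Counter({label: 0 for label in PATTERNS})
    for line in lines:
        for label, needle in PATTERNS.items():
            if needle in line:
                counts[label] += 1
    return counts


def tally_log_file(path, encoding="utf-8"):
    """Convenience wrapper: tally errors in a text log at `path`.

    The encoding is an assumption; undecodable bytes are replaced
    rather than raising, so a partially garbled log still gets counted.
    """
    with open(path, encoding=encoding, errors="replace") as fh:
        return tally_errors(fh)
```

Running it against the logs from each node separately, once with both QVS services enabled and once with one disabled, would give a rough before/after comparison.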
Recently, QV support has suggested that the problem might be related to the geographically distributed QVS servers. They suggested that we relocate the root folder to a drive local to one of the two cluster nodes. We did this, and it reduced the number of errors, but did not eliminate them entirely.
Note that our actual QVWs are also located on the aforementioned Windows share, albeit in different directories. These folders were not moved to the local drive.
All errors also cease when we disable the QVS on one of the cluster nodes, and the errors return when the second QVS service is re-enabled.
We are using IPStor, not Veritas, and the LUNs are actually served through a Windows 2003 server, so they are presented as ordinary Windows shares. In other words, no clients access the SAN directly; they all go through the Windows 2003 server.
I'm not sure what you mean by "hosted on the Windows clustered OS". The official response from QV is that clustered QVS servers cannot be geographically dispersed, and there is no plan to support this in the foreseeable future.
It is not clear how this topological distance is measured (our SAN storage is fairly low latency, under 50 ms on average, regardless of client location on the network). Nor is it clear whether the problem-causing distance is the distance between the QVS servers or the distance between the QVS server(s) and the Windows server sharing the common root folder.
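One way to put a number on the second kind of distance is to time simple metadata requests against the share from each QVS node. The following Python sketch is only an illustration under my own assumptions: the function name is hypothetical, it measures only the round trip of a file-status call, and it says nothing about how QV itself defines or measures distance.

```python
import os
import statistics
import time


def stat_latency_ms(path, samples=20):
    """Time repeated os.stat() calls on `path` and summarize in milliseconds.

    Note: the OS or the SMB client may cache metadata, so the figures
    can understate the real network round trip.
    """
    timings = []
    for _ in range(samples):
        start = time.perf_counter()
        os.stat(path)
        timings.append((time.perf_counter() - start) * 1000.0)
    return {
        "mean_ms": statistics.mean(timings),
        "max_ms": max(timings),
        "samples": samples,
    }
```

Calling it on each node with the UNC path of the shared root folder, and again with a local folder for comparison, would at least show whether the two nodes see noticeably different latency to the share.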
It is the shared root folder that is causing the problem, according to QV. We have even tried putting the shared root folder on the local C: drive of one of the geographically dispersed QVS servers, but this does not resolve the problem.
Have you found a solution to the PGO file errors and the shared file disappearing? I have a similar issue on my platform, which is built as a cluster of two QVS nodes.
If you have found any recommendations on shared storage configuration for a QV cluster, please share them.
Thanks a lot,