4 Replies Latest reply: Apr 4, 2012 4:03 PM by Brent Nichol

    Unexplained Server Restart

    Brent Nichol

      Is there a log to identify why a QVS would restart on its own?

       

      I'm able to identify that a restart occurred from the Performance and Events logs, but neither gives any indication of why the restart occurred.  Is there another way to identify the reason for this activity?

       

      Here are excerpts from the logs...

      Events.log

      2012-04-03 02:36:00    2012-04-03 14:42:37    4    504    Information    Ticket Lookup: Ticket FB1D8F2B3B0C832E1BD645DEE7DA292FD17DA382 was found.

      2012-04-03 14:52:00    2012-04-03 14:52:01    4    504    Information    WorkingSet: Configured Working Set Size is 28.799-31.359 GB (setting 90%-98%)

      2012-04-03 14:52:00    2012-04-03 14:52:01    4    500    Information    Debug: Calling StartServiceCtrlDispatcher()

      2012-04-03 14:52:00    2012-04-03 14:52:01    4    500    Information    Debug: Entering CNTService::ServiceMain()

      2012-04-03 14:52:00    2012-04-03 14:52:01    4    500    Information    Debug: CNTService::SetStatus(3147728, 2)

      2012-04-03 14:52:00    2012-04-03 14:52:01    4    500    Information    Debug: CNTService::SetStatus(3147728, 4)

      2012-04-03 14:52:00    2012-04-03 14:52:01    4    500    Information    Debug: 8 CPU cores found in 2 CPUs

      2012-04-03 14:52:00    2012-04-03 14:52:01    4    301    Information    Found Mount  at location \\Prod  browsable

      2012-04-03 14:52:00    2012-04-03 14:52:02    4    504    Information    Document PreLoad: Start

      2012-04-03 14:52:00    2012-04-03 14:52:03    4    504    Information    DOC loading: Beginning load of document \\PROD\Document.QVW.

       

      Performance.log

      RLS64    9.00.7773.0409.10    2012-04-03 02:36:00    2012-04-03 14:40:00    Normal    17    108    0    0    0    0    5    13    28    68    4    0    4932    77    16    85    16    84    4    24476    27106    8361502    8352910    -1.00

      RLS64    9.00.7773.0409.10    2012-04-03 14:52:00    2012-04-03 14:52:01    Server starting    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    127    192    8388416    8379084    -1.00

      RLS64    9.00.7773.0409.10    2012-04-03 14:52:00    2012-04-03 14:55:00    Normal    10    11    0    0    0    0    14    6    7    8    7    0    5154    73    10    11    10    11    11    4917    5025    8383583    8379084    -1.00

       

       

      Any suggestions would be appreciated,

      B

        • Unexplained Server Restart
          Gary Strader

          Those are the two logs that have the information you need.  You could try setting the log level to high, but it looks like you already have that set.  You could also try to correlate events in the Windows event logs.  This application is good for that: http://community.qlik.com/qlikviews/1029
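If you end up correlating by hand, something like the following might help. It is only a rough sketch: it assumes the Events.log fields are separated by tabs or runs of spaces in the order shown in the excerpt (server start, timestamp, two numeric IDs, severity, message), and `events_before` is a made-up helper name, not part of any QlikView tooling. Given a timestamp taken from the Windows event log, it lists the QVS events logged in the minutes before it.

```python
import re
from datetime import datetime, timedelta

FMT = "%Y-%m-%d %H:%M:%S"

# Two rows copied from the Events.log excerpt in the original post.
SAMPLE = [
    "2012-04-03 02:36:00    2012-04-03 14:42:37    4    504    Information    Ticket Lookup: Ticket FB1D8F2B3B0C832E1BD645DEE7DA292FD17DA382 was found.",
    "2012-04-03 14:52:00    2012-04-03 14:52:01    4    504    Information    WorkingSet: Configured Working Set Size is 28.799-31.359 GB (setting 90%-98%)",
]

def events_before(lines, moment, minutes=15):
    """Return (timestamp, severity, message) for rows logged in the window before `moment`."""
    window_start = moment - timedelta(minutes=minutes)
    hits = []
    for line in lines:
        # ASSUMPTION: fields are split by tabs or 2+ spaces; the message is the 6th field.
        fields = re.split(r"\t+| {2,}", line.strip(), maxsplit=5)
        if len(fields) < 6:
            continue
        try:
            stamp = datetime.strptime(fields[1], FMT)
        except ValueError:
            continue  # header row or malformed line
        if window_start <= stamp < moment:
            hits.append((stamp, fields[4], fields[5]))
    return hits

# Here the restart time from the logs stands in for the Windows event time.
recent = events_before(SAMPLE, datetime(2012, 4, 3, 14, 52, 0))
for stamp, severity, message in recent:
    print(stamp, severity, message)
```

On the two sample rows this only picks up the ticket lookup at 14:42:37, which matches the complaint that nothing unusual was logged before the restart.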

           

          Your working set size is 90 - 98%.  Why so high?  Default is 70 - 90% I think.  Based on the perf log it looks like you're getting close to that RAM limit.  This shouldn't cause a server crash, but sometimes it does.
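To put rough numbers on that, here is a small back-of-the-envelope sketch. The working set figures come straight from the Events.log line above. Two things are assumptions on my part: that the 27106 in the last Normal row before the restart is committed virtual memory in MB (the excerpt has no column headers), and that the defaults really are 70% and 90%.

```python
# Figures quoted from the Events.log line:
# "Configured Working Set Size is 28.799-31.359 GB (setting 90%-98%)"
LOW_GB, HIGH_GB = 28.799, 31.359
LOW_PCT, HIGH_PCT = 0.90, 0.98

def implied_total_gb(limit_gb, fraction):
    """Back out the total RAM the server must be using as its 100% figure."""
    return limit_gb / fraction

def working_set_limits(total_gb, low_pct, high_pct):
    """Return the (low, high) working set limits in GB for the given percentages."""
    return total_gb * low_pct, total_gb * high_pct

# Both limits should imply the same total; average them to smooth rounding.
total_gb = (implied_total_gb(LOW_GB, LOW_PCT) + implied_total_gb(HIGH_GB, HIGH_PCT)) / 2

# ASSUMPTION: 27106 is committed virtual memory in MB in the last row before the restart.
committed_gb = 27106 / 1024

current_low, current_high = working_set_limits(total_gb, 0.90, 0.98)
# ASSUMPTION: 70%-90% defaults, as recalled above and not verified.
default_low, default_high = working_set_limits(total_gb, 0.70, 0.90)

print(f"implied total RAM: {total_gb:.1f} GB")
print(f"committed before restart: {committed_gb:.1f} GB")
print(f"current limits: {current_low:.1f}-{current_high:.1f} GB")
print(f"assumed default limits: {default_low:.1f}-{default_high:.1f} GB")
```

If those assumptions hold, the server was above what the default low limit would be but still under the configured 90% limit when it went down.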

           

          Did you really have no server traffic between 02:36:00 and 14:52:00?  Or are those just snippets from the logs?  Sometimes when the server crashes, it stops logging to the log file.

           

          I would call QlikTech Support.

            • Unexplained Server Restart
              Brent Nichol

              The times 02:36:00 and 14:52:00 are the server start times, not the transaction times.  There was the usual amount of activity between these times.
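For anyone who wants to spot these restarts without reading the file by eye, here is a rough sketch of the idea: watch the server-start column in Performance.log and flag every row where it changes. It assumes the fields are separated by tabs or runs of spaces in the order shown in my excerpt, and `find_restarts` is just a name I made up for illustration. The sample rows are cut down to their first five fields.

```python
import re
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"

# First five fields of the three Performance.log rows from the original post.
SAMPLE = [
    "RLS64    9.00.7773.0409.10    2012-04-03 02:36:00    2012-04-03 14:40:00    Normal",
    "RLS64    9.00.7773.0409.10    2012-04-03 14:52:00    2012-04-03 14:52:01    Server starting",
    "RLS64    9.00.7773.0409.10    2012-04-03 14:52:00    2012-04-03 14:55:00    Normal",
]

def find_restarts(lines):
    """Return (last_row_before, new_server_start) pairs wherever the start column changes."""
    restarts = []
    prev_start = None
    prev_stamp = None
    for line in lines:
        # ASSUMPTION: fields are split by tabs or 2+ spaces; column 3 is the
        # server start time and column 4 is the row's own timestamp.
        fields = re.split(r"\t+| {2,}", line.strip())
        if len(fields) < 5:
            continue
        try:
            started = datetime.strptime(fields[2], FMT)
            stamp = datetime.strptime(fields[3], FMT)
        except ValueError:
            continue  # header row or malformed line
        if prev_start is not None and started != prev_start:
            restarts.append((prev_stamp, started))
        prev_start, prev_stamp = started, stamp
    return restarts

restarts = find_restarts(SAMPLE)
for last_seen, new_start in restarts:
    print(f"last row at {last_seen}, server restarted at {new_start}, gap {new_start - last_seen}")
```

On these rows it reports a single restart, with the last row written at 14:40:00 and the new start at 14:52:00. That only tells you when the restart happened, not why.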

               

              The working set is configured to reduce the amount of unnecessary Virtual Memory logging.  The server usually runs about 80%-90%, and we only want warnings when those values are exceeded.

               

              The event viewer had the following, but it is not helpful...

              The QlikView Server service terminated unexpectedly. It has done this 1 time(s). The following corrective action will be taken in 10 milliseconds: Restart the service.

               

               

              Any other suggestions?