Sys Maintenance: Troubleshooting Health & Wellness

Document created by RSA Information Design and Development on Mar 22, 2017. Last modified by RSA Information Design and Development on Aug 1, 2017.
Version 8

Issues Common to All Hosts and Services 

The Health & Wellness interface may display incorrect statistics if:

  • Some or all of the hosts and services are not provisioned and enabled correctly.
  • You have a mixed-version deployment (that is, hosts updated to different Security Analytics versions).
  • Supporting services are not running.

Issues Identified by Messages in the Interface or Log Files

This section provides troubleshooting information for issues identified by messages that Security Analytics displays in the Health & Wellness interface or writes to the Health & Wellness log files.

                 
Message: User Interface: Cannot connect to System Management Service
System Management Service (SMS) logs:
Caught an exception during connection recovery!
java.io.IOException
at com.rabbitmq.client.impl.AMQChannel.wrap(AMQChannel.java:106)
at com.rabbitmq.client.impl.AMQChannel.wrap(AMQChannel.java:102)
at com.rabbitmq.client.impl.AMQConnection.start(AMQConnection.java:346)
at com.rabbitmq.client.impl.recovery.RecoveryAwareAMQConnectionFactory.newConnection(RecoveryAwareAMQConnectionFactory.java:36)
at com.rabbitmq.client.impl.recovery.AutorecoveringConnection.recoverConnection(AutorecoveringConnection.java:388)
at com.rabbitmq.client.impl.recovery.AutorecoveringConnection.beginAutomaticRecovery(AutorecoveringConnection.java:360)
at com.rabbitmq.client.impl.recovery.AutorecoveringConnection.access$000(AutorecoveringConnection.java:48)
at com.rabbitmq.client.impl.recovery.AutorecoveringConnection$1.shutdownCompleted(AutorecoveringConnection.java:345)
at com.rabbitmq.client.impl.ShutdownNotifierComponent.notifyListeners(ShutdownNotifierComponent.java:75)
at com.rabbitmq.client.impl.AMQConnection$MainLoop.run(AMQConnection.java:572)
at java.lang.Thread.run(Thread.java:745)
Caused by: com.rabbitmq.client.ShutdownSignalException: connection error
at com.rabbitmq.utility.ValueOrException.getValue(ValueOrException.java:67)
at com.rabbitmq.utility.BlockingValueOrException.uninterruptibleGetValue(BlockingValueOrException.java:33)
at com.rabbitmq.client.impl.AMQChannel$BlockingRpcContinuation.getReply(AMQChannel.java:343)
at com.rabbitmq.client.impl.AMQConnection.start(AMQConnection.java:292)
... 8 more
Caused by: java.net.SocketException: Connection reset
at java.net.SocketInputStream.read(SocketInputStream.java:189)
at java.net.SocketInputStream.read(SocketInputStream.java:121)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
at java.io.BufferedInputStream.read(BufferedInputStream.java:265)
at java.io.DataInputStream.readUnsignedByte(DataInputStream.java:288)
at com.rabbitmq.client.impl.Frame.readFrom(Frame.java:95)
at com.rabbitmq.client.impl.SocketFrameHandler.readFrame(SocketFrameHandler.java:139)
at com.rabbitmq.client.impl.AMQConnection$MainLoop.run(AMQConnection.java:532)

Possible Cause: The RabbitMQ service is not running on the Security Analytics host.
Solution: Restart the RabbitMQ service using the following command.
service rabbitmq-server restart

 

               
Message/Problem: User Interface: Cannot connect to System Management Service
Cause: The System Management Service (SMS), RabbitMQ, or Tokumx service is not running.
Solution: Run the following commands on the Security Analytics server to make sure all of these services are running.
[root@saserver ~]# service rsa-sms status
RSA NetWitness SMS :: Server is not running.
[root@saserver ~]# service rsa-sms start
Starting RSA NetWitness SMS :: Server...
[root@saserver ~]# service rsa-sms status
RSA NetWitness SMS :: Server is running (5687).
[root@saserver ~]# service tokumx status
tokumx (pid  2779) is running...
[root@saserver ~]# service rabbitmq-server status
Status of node sa@localhost ...
[{pid,2501},
 {running_applications,
     [{rabbitmq_federation_management,"RabbitMQ Federation Management",
          "3.3.4"},

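The manual checks above can be scripted. The following is a minimal sketch, assuming the same `service` command and the three service names shown in the transcript (`rsa-sms`, `tokumx`, `rabbitmq-server`). The function name `check_services` is hypothetical and not part of Security Analytics, and the sketch assumes that `service <name> status` exits with 0 only when the service is running.

```shell
#!/bin/sh
# Hypothetical helper: report which of the given services are not running.
# Assumption: `service <name> status` exits 0 only when the service is running
# (the usual init-script convention).
check_services() {
    failed=0
    for svc in "$@"; do
        if service "$svc" status >/dev/null 2>&1; then
            echo "$svc: running"
        else
            echo "$svc: NOT running"
            failed=1
        fi
    done
    # Exit status is 1 if at least one service was not running.
    return "$failed"
}

# Example (run as root on the Security Analytics server):
#   check_services rsa-sms tokumx rabbitmq-server
```

The function only reports status. Starting a stopped service is left as a deliberate manual step, as in the transcript above.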
 

               
Message/Problem: User Interface: Cannot connect to System Management Service
Possible Cause: /var/lib/rabbitmq partition usage is 70% or greater.
Solution: Contact Customer Care.
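Before contacting Customer Care, it may help to confirm the partition usage. The following is a minimal sketch that reads the Capacity column of POSIX `df -P` output. The function names `parse_capacity` and `below_threshold` are hypothetical, and the default threshold of 70 simply mirrors the figure quoted above.

```shell
#!/bin/sh
# Hypothetical helper: read `df -P` output on stdin and print the Capacity
# column (5th field of the 2nd line) as an integer without the % sign.
parse_capacity() {
    awk 'NR==2 { sub(/%/, "", $5); print $5 }'
}

# Hypothetical helper: succeed (exit 0) when the filesystem holding path $1
# is below the threshold $2 (default 70, the figure quoted in this section).
below_threshold() {
    pct=$(df -P "$1" | parse_capacity)
    [ -n "$pct" ] && [ "$pct" -lt "${2:-70}" ]
}

# Example:
#   below_threshold /var/lib/rabbitmq 70 || echo "usage is 70% or greater"
```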

 

               
Message/Problem: User Interface: Host migration failed.
Possible Cause: One or more Security Analytics services may be in a stopped state.
Solution: Make sure that the following services are running, then restart the Security Analytics server:
Archiver, Broker, Concentrator, Decoder, Event Stream Analysis, Incident Management, IPDB Extractor, Log Collector, Log Decoder, Malware Analysis, Reporting Engine, Warehouse Connector, Workbench.

 

               
Message/Problem: User Interface: Server Unavailable.
Possible Cause: One or more Security Analytics services may be in a stopped state.
Solution: Make sure that the following services are running, then restart the Security Analytics server:
Archiver, Broker, Concentrator, Decoder, Event Stream Analysis, Incident Management, IPDB Extractor, Log Collector, Log Decoder, Malware Analysis, Reporting Engine, Warehouse Connector, Workbench.

 

                         
Message/Problem: User Interface: Server Unavailable
Possible Cause: The System Management Service (SMS), RabbitMQ, or Tokumx service is not running.
Solution 1: Run the following commands on the Security Analytics server to make sure all of these services are running.
[root@saserver ~]# service rsa-sms status
RSA NetWitness SMS :: Server is not running.
[root@saserver ~]# service rsa-sms start
Starting RSA NetWitness SMS :: Server...
[root@saserver ~]# service rsa-sms status
RSA NetWitness SMS :: Server is running (5687).
[root@saserver ~]# service tokumx status
tokumx (pid  2779) is running...
[root@saserver ~]# service rabbitmq-server status
Status of node sa@localhost ...
[{pid,2501},
 {running_applications,
     [{rabbitmq_federation_management,"RabbitMQ Federation Management",
          "3.3.4"},
Solution 2: Make sure the /var/lib/rabbitmq partition is less than 75% full.
Solution 3: Check the Security Analytics host log file (/var/lib/netwitness/uax/logs/sa.log) for any errors.
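To scan the host log for errors, a simple text search is usually enough. The following is a minimal sketch. The function name `recent_errors` is hypothetical, the absolute log path in the example assumes a leading slash on the path given above, and matching the word "error" case-insensitively is an assumption about how errors appear in the log.

```shell
#!/bin/sh
# Hypothetical helper: print the last N lines (default 20) of log file $1
# that contain the word "error", matched case-insensitively.
recent_errors() {
    grep -i 'error' "$1" | tail -n "${2:-20}"
}

# Example (path assumed from the section above):
#   recent_errors /var/lib/netwitness/uax/logs/sa.log
```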

Issues Not Identified by the User Interface or Logs

This section provides troubleshooting information for issues that are not identified by messages that Security Analytics displays in the Health & Wellness interface or writes to the Health & Wellness log files. For example, you may see incorrect statistical information in the interface.

               
Problem: Incorrect statistics displayed in Health and Wellness interface.
Possible Cause: The Puppet service is not running. The Puppet service must be running on all services.
Solution: Restart the Puppet service.

 

               
Problem: Incorrect statistics displayed in Health and Wellness interface.
Possible Cause: The SMS service is not running. The SMS service must be running on the Security Analytics host.
Solution: Restart the SMS service.

 

               
Problem: Security Analytics does not show the version to which you upgraded until you restart jettysrv (jeTTy server).
Possible Cause: When Security Analytics checks a connection, it polls a service every 30 seconds to see if it is active. If the service comes back up within those 30 seconds, Security Analytics does not pick up the new version.
Solution:
  1. Manually stop the service.
  2. Wait until you see that it is offline.
  3. Restart the service.
    Security Analytics displays the correct version.
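The stop, wait, and restart sequence above can be sketched as a small shell function. This is an illustration only: the function name `restart_with_pause` is hypothetical, the default pause of 35 seconds is an assumption chosen to exceed the 30-second polling interval, and a fixed pause does not replace confirming in the interface that the service shows as offline.

```shell
#!/bin/sh
# Hypothetical helper: stop service $1, pause, then start it again.
# $2 is the pause in seconds (default 35, an assumed value slightly longer
# than the 30-second polling interval described above).
restart_with_pause() {
    svc=$1
    pause=${2:-35}
    service "$svc" stop || return 1
    sleep "$pause"
    service "$svc" start
}

# Example (the service name is a placeholder):
#   restart_with_pause <service-name>
```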

 

               
Problem: The Security Analytics server does not display the Service Unavailable page.
Possible Cause: After you upgrade to Security Analytics version 10.5, JDK 1.8 is not the default version, which causes jettysrv (jeTTy server) to fail to start. Without the jeTTy server, the Security Analytics server cannot display the Service Unavailable page.
Solution: Restart the jeTTy server.