Sys Maintenance: Troubleshooting Health & Wellness

Document created by RSA Information Design and Development on Sep 14, 2017. Last modified by RSA Information Design and Development on Oct 13, 2017.
Version 10
  

Issues Common to All Hosts and Services 

You may see the wrong statistics in the Health & Wellness interface if:

  • Some or all of the hosts and services are not provisioned and enabled correctly.
  • You have a mixed-version deployment (that is, hosts updated to different NetWitness Suite versions).
  • Supporting services are not running.

Issues Identified by Messages in the Interface or Log Files

This section provides troubleshooting information for issues identified by messages NetWitness Suite displays in the Health & Wellness Interface or includes in the Health & Wellness log files.

                 
Message: User Interface: Cannot connect to System Management Service
System Management Service (SMS) logs:

Caught an exception during connection recovery!
java.io.IOException
at com.rabbitmq.client.impl.AMQChannel.wrap(AMQChannel.java:106)
at com.rabbitmq.client.impl.AMQChannel.wrap(AMQChannel.java:102)
at com.rabbitmq.client.impl.AMQConnection.start(AMQConnection.java:346)
at com.rabbitmq.client.impl.recovery.RecoveryAwareAMQConnectionFactory.newConnection(RecoveryAwareAMQConnectionFactory.java:36)
at com.rabbitmq.client.impl.recovery.AutorecoveringConnection.recoverConnection(AutorecoveringConnection.java:388)
at com.rabbitmq.client.impl.recovery.AutorecoveringConnection.beginAutomaticRecovery(AutorecoveringConnection.java:360)
at com.rabbitmq.client.impl.recovery.AutorecoveringConnection.access$000(AutorecoveringConnection.java:48)
at com.rabbitmq.client.impl.recovery.AutorecoveringConnection$1.shutdownCompleted(AutorecoveringConnection.java:345)
at com.rabbitmq.client.impl.ShutdownNotifierComponent.notifyListeners(ShutdownNotifierComponent.java:75)
at com.rabbitmq.client.impl.AMQConnection$MainLoop.run(AMQConnection.java:572)
at java.lang.Thread.run(Thread.java:745)
Caused by: com.rabbitmq.client.ShutdownSignalException: connection error
at com.rabbitmq.utility.ValueOrException.getValue(ValueOrException.java:67)
at com.rabbitmq.utility.BlockingValueOrException.uninterruptibleGetValue(BlockingValueOrException.java:33)
at com.rabbitmq.client.impl.AMQChannel$BlockingRpcContinuation.getReply(AMQChannel.java:343)
at com.rabbitmq.client.impl.AMQConnection.start(AMQConnection.java:292)
... 8 more
Caused by: java.net.SocketException: Connection reset
at java.net.SocketInputStream.read(SocketInputStream.java:189)
at java.net.SocketInputStream.read(SocketInputStream.java:121)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
at java.io.BufferedInputStream.read(BufferedInputStream.java:265)
at java.io.DataInputStream.readUnsignedByte(DataInputStream.java:288)
at com.rabbitmq.client.impl.Frame.readFrom(Frame.java:95)
at com.rabbitmq.client.impl.SocketFrameHandler.readFrame(SocketFrameHandler.java:139)
at com.rabbitmq.client.impl.AMQConnection$MainLoop.run(AMQConnection.java:532)
Possible Cause: The RabbitMQ service is not running on the NetWitness Server.
Solution: Restart the RabbitMQ, SMS, and NetWitness Suite services using the following commands.
systemctl restart rabbitmq-server
systemctl restart rsa-sms
systemctl restart jetty
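
After the restart commands above, one way to confirm that the services came back is to ask systemd about each of them. The following is a minimal sketch, not an RSA-supplied tool: check_services is a hypothetical helper name, and IS_ACTIVE is a hypothetical override hook added only so the check can be dry-run without systemd.

```shell
# Sketch: report which services are not active.
# IS_ACTIVE is a hypothetical override hook; by default the check
# calls `systemctl is-active --quiet <service>`.
IS_ACTIVE=${IS_ACTIVE:-"systemctl is-active --quiet"}

# Prints the name of every service that is not active.
# Returns 1 if at least one service is down, 0 otherwise.
check_services() {
    local rc=0 svc
    for svc in "$@"; do
        if ! $IS_ACTIVE "$svc"; then
            echo "$svc"
            rc=1
        fi
    done
    return $rc
}

# Usage on the NetWitness Server, with the service names from this section:
#   check_services rabbitmq-server rsa-sms jetty
```

Any name the function prints is a service that still needs attention.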

 

                 
Message/Problem: User Interface: Cannot connect to System Management Service
Cause: The System Management Service, RabbitMQ, or Mongo service is not running.
Solution: Run the following commands on the NetWitness Server to make sure all of these services are running.
[root@nwserver ~]# systemctl status rsa-sms
RSA NetWitness SMS :: Server is not running.
[root@nwserver ~]# systemctl start rsa-sms
Starting RSA NetWitness SMS :: Server...
[root@nwserver ~]# systemctl status rsa-sms
RSA NetWitness SMS :: Server is running (5687).
[root@nwserver ~]# systemctl status mongod
mongod (pid  2779) is running...
[root@nwserver ~]# systemctl status rabbitmq-server
Status of node nw@localhost ...
[{pid,2501},
 {running_applications,
     [{rabbitmq_federation_management,"RabbitMQ Federation Management",
          "3.3.4"},

 

                 
Message/Problem: User Interface: Cannot connect to System Management Service
Possible Cause: /var/lib/rabbitmq partition usage is 70% or greater.
Solution: Contact Customer Care.
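
The partition usage named above can be checked before contacting Customer Care. This is a minimal sketch that assumes a POSIX df and awk and a filesystem name without spaces; partition_usage_pct and at_or_above are hypothetical helper names, not NetWitness commands.

```shell
# Prints the used percentage (integer, without the % sign) of the
# filesystem that holds the path given as $1.
partition_usage_pct() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Returns 0 (true) when usage $1 is at or above threshold $2.
at_or_above() {
    [ "$1" -ge "$2" ]
}

# Usage:
#   pct=$(partition_usage_pct /var/lib/rabbitmq)
#   at_or_above "$pct" 70 && echo "/var/lib/rabbitmq is ${pct}% full"
```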

 

                 
Message/Problem: User Interface: Host migration failed.
Possible Cause: One or more NetWitness Suite services may be in a stopped state.
Solution: Make sure that the following services are running, then restart the NetWitness Server:
Archiver, Broker, Concentrator, Decoder, Event Stream Analysis, Response Server, IPDB Extractor, Log Collector, Log Decoder, Malware Analysis, Reporting Engine, Warehouse Connector, Workbench.

 

                 
Message/Problem: User Interface: Server Unavailable.
Possible Cause: One or more NetWitness Suite services may be in a stopped state.
Solution: Make sure that the following services are running, then restart the NetWitness Server:
Archiver, Broker, Concentrator, Decoder, Event Stream Analysis, Response Server, IPDB Extractor, Log Collector, Log Decoder, Malware Analysis, Reporting Engine, Warehouse Connector, Workbench.

 

                         
Message/Problem: User Interface: Server Unavailable
Possible Cause: The System Management Service (SMS), RabbitMQ, or Mongo service is not running.
Solution 1: Run the following commands on the NetWitness Server to make sure all of these services are running.
[root@nwserver ~]# systemctl status rsa-sms
RSA NetWitness SMS :: Server is not running.
[root@nwserver ~]# systemctl start rsa-sms
Starting RSA NetWitness SMS :: Server...
[root@nwserver ~]# systemctl status rsa-sms
RSA NetWitness SMS :: Server is running (5687).
[root@nwserver ~]# systemctl status mongod
mongod (pid  2779) is running...
[root@nwserver ~]# systemctl status rabbitmq-server
Status of node nw@localhost ...
[{pid,2501},
 {running_applications,
     [{rabbitmq_federation_management,"RabbitMQ Federation Management",
          "3.3.4"},
Solution 2: Make sure the /var/lib/rabbitmq partition is less than 75% full.
Solution 3: Check the NetWitness Server log file (/var/lib/netwitness/uax/logs/nw.log) for any errors.
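
Checking the log file for errors can be approached with a simple filter. This is a minimal sketch that assumes error entries contain the literal text ERROR, which the source does not confirm; recent_errors is a hypothetical helper name.

```shell
# Prints the last $2 lines (default 20) of file $1 that contain ERROR.
# Assumption: error entries in the log carry the literal word ERROR.
recent_errors() {
    grep 'ERROR' "$1" | tail -n "${2:-20}"
}

# Usage:
#   recent_errors /var/lib/netwitness/uax/logs/nw.log 50
```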

 

                         
Message/Problem: ContextHub stops and does not allow you to add or edit data sources and lists.
Possible Cause: Storage is 95% full or more.
Solution 1: Increase the storage by updating the YML file, located at /etc/netwitness/contexthub-server/contexthub-server.yml.
For example, to increase storage from 120 GB to 150 GB, enter the new value in bytes for the relevant parameter: rsa.contexthub.data.disk-size: 161061273600

Solution 2: Delete unwanted or unused large lists.
Solution 3: Configure the TTL index for the list to automatically delete STIX and TAXII data and to clean up storage space.
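
The byte value in Solution 1 corresponds to 150 × 1024 × 1024 × 1024. A small sketch of that arithmetic follows, for working out other sizes; gb_to_bytes is a hypothetical helper name, and the binary multiplier is inferred from the example value rather than stated by the source.

```shell
# Converts a size in GB to bytes, using 1 GB = 1024^3 bytes
# (the multiplier that reproduces the example value in this section).
gb_to_bytes() {
    echo $(( $1 * 1024 * 1024 * 1024 ))
}

# Usage:
#   gb_to_bytes 150    # prints 161061273600
```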

 

                         
Message/Problem: Context Hub runs on fixed memory, 50% of which is reserved for cache. When the cache is 100% full, cache responses stop and all new lookups are slow.
Possible Cause: The cache is 50% full or more.
Solution 1: By default, Context Hub cleans the cache every 30 minutes. Reduce the cache expiration time of the data sources.
Solution 2: Disable the cache for data sources.
Solution 3: Increase the memory available to the Context Hub Java process by editing the -Xmx option in the /etc/netwitness/contexthub-server/contexthub-server.conf file. In JAVA_OPTS, search for the -Xmx option.
For example, edit the entry as follows:
-Xmx8G
where 8G represents 8 GB of memory. Then restart the ContextHub service.

Note: The memory you allocate must be less than the available system memory. Be aware that many other services also run on the host.
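
To sanity-check a planned -Xmx value against the note above, the option can be converted to megabytes and compared with the host's total memory. This is a minimal sketch under stated assumptions: xmx_to_mb and total_mem_mb are hypothetical helper names, only the g and m suffixes are handled, and total memory is read from /proc/meminfo on Linux.

```shell
# Converts a JVM -Xmx option such as -Xmx8G or -Xmx8192m to megabytes.
xmx_to_mb() {
    local value=${1#-Xmx}          # strip the option name
    local number=${value%[gGmM]}   # strip the unit suffix
    case $value in
        *[gG]) echo $(( number * 1024 )) ;;
        *[mM]) echo "$number" ;;
        *)     return 1 ;;         # other forms are not handled in this sketch
    esac
}

# Prints total system memory in MB, read from /proc/meminfo (Linux).
total_mem_mb() {
    awk '/^MemTotal:/ { print int($2 / 1024) }' /proc/meminfo
}

# Usage:
#   [ "$(xmx_to_mb -Xmx8G)" -lt "$(total_mem_mb)" ] || echo "-Xmx is too large for this host"
```

Total memory is only an upper bound; the other services on the host need their share as well.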

 

                             
Message/Problem: List Data Source displays unhealthy stats or status.
Possible Cause 1: One of the following:
  • Unable to access the data source.
  • Unable to parse or read a CSV file.
  • The CSV file does not match the configured schema.

Possible Cause 2: Unable to authenticate when accessing the data source.
Solution 1: Make sure the CSV file is saved in the correct location, that is, /var/lib/netwitness/contexthub-server/data/, and verify that it has the required read permissions.
Solution 2: Make sure the CSV file matches the schema specified when the data source was configured. If it does not, either create a new data source with the new schema or edit the CSV file to match the schema. For example, suppose you configure a List Data Source with a schema of column1, column2, and column3, and a later update to the CSV file adds or removes columns or changes their order. The schema no longer matches, and the configured List Data Source shows “Unhealthy” in the Health and Wellness stats.
Solution 3: Make sure the password is correct. To confirm, edit the data source, enter the password, and click Test Connection.

For more information related to the above solutions, see the Configure Lists as a Data Source topic in the Context Hub Configuration Guide.
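
A changed column count, one of the schema mismatches described in Solution 2, can be spotted before the file is picked up. This is a minimal sketch that assumes a plain comma-separated file with no quoted commas; csv_check_columns is a hypothetical helper name, and the check does not detect reordered columns.

```shell
# Prints the line number of every row in CSV file $1 whose column count
# differs from $2. Returns 1 if any row mismatches, 0 otherwise.
# Assumption: fields never contain quoted commas.
csv_check_columns() {
    awk -F',' -v want="$2" 'NF != want { print NR; bad = 1 } END { exit bad + 0 }' "$1"
}

# Usage, for a schema with three columns (file name is a placeholder):
#   csv_check_columns /var/lib/netwitness/contexthub-server/data/mylist.csv 3
```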

Issues Not Identified by the User Interface or Logs

This section provides troubleshooting information for issues that are not identified by messages NetWitness Suite displays in the Health & Wellness Interface or includes in the Health & Wellness log files.  For example, you may see incorrect statistical information in the Interface. 

 

                 
Problem: Incorrect statistics are displayed in the Health and Wellness interface.
Possible Cause: The SMS service is not running. The SMS service must be running on the NetWitness Server.
Solution: Restart the SMS service.

 

                 
Problem: NetWitness Suite does not show the version to which you upgraded until you restart jettysrv (Jetty server).
Possible Cause: When NetWitness Suite checks a connection, it polls a service every 30 seconds to see if it is active. If the service goes down and comes back up within that 30 seconds, NetWitness Suite does not pick up the new version.
Solution:
  1. Manually stop the service.
  2. Wait until you see that it is offline.
  3. Restart the service.
    NetWitness Suite displays the correct version.

 

                 
Problem: NetWitness Server does not display the Service Unavailable page.
Possible Cause: After you upgrade to NetWitness Suite version 10.5, JDK 1.8 is not the default version, which causes jettysrv (Jetty server) to fail to start. Without the Jetty server, the NetWitness Server cannot display the Service Unavailable page.
Solution: Restart jettysrv.

 

             
Problem: The SMS service is stopped and the following error is displayed in the log file: java.lang.OutOfMemoryError: Java heap space

Solution: Increase the memory according to your needs as follows.

  1. Open /opt/rsa/sms/conf/wrapper.conf

  2. Replace wrapper.java.additional.1=-Xmx8192m with:
    wrapper.java.additional.1=-Xmx16g

  3. Restart the SMS service:
    systemctl restart rsa-sms
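
The edit in step 2 can also be scripted. This is a minimal sketch, assuming the heap option sits on the wrapper.java.additional.1 line exactly as shown above; set_sms_heap is a hypothetical helper name, and the function keeps a .bak copy of the file it changes.

```shell
# Rewrites the -Xmx value on the wrapper.java.additional.1 line of file $1
# to $2 (for example 16g). The original file is kept as <file>.bak.
set_sms_heap() {
    sed -i.bak "s/^\(wrapper\.java\.additional\.1=-Xmx\).*/\1$2/" "$1"
}

# Usage, followed by a restart of the SMS service:
#   set_sms_heap /opt/rsa/sms/conf/wrapper.conf 16g
#   systemctl restart rsa-sms
```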