Troubleshooting Health & Wellness

Issues Common to All Hosts and Services

You may see incorrect statistics in the Health & Wellness interface if:

  • Some or all of the hosts and services are not provisioned and enabled correctly.
  • You have a mixed-version deployment (that is, hosts updated to different NetWitness Platform versions).
  • Supporting services are not running.

Issues Identified by Messages in the Interface or Log Files

This section provides troubleshooting information for issues identified by messages that NetWitness Platform displays in the Health & Wellness interface or includes in the Health & Wellness log files.

Message

User Interface: Cannot connect to System Management Service
System Management Service (SMS) logs:

Caught an exception during connection recovery!
java.io.IOException
at com.rabbitmq.client.impl.AMQChannel.wrap(AMQChannel.java:106)
at com.rabbitmq.client.impl.AMQChannel.wrap(AMQChannel.java:102)
at com.rabbitmq.client.impl.AMQConnection.start(AMQConnection.java:346)
at com.rabbitmq.client.impl.recovery.RecoveryAwareAMQConnectionFactory.newConnection(RecoveryAwareAMQConnectionFactory.java:36)
at com.rabbitmq.client.impl.recovery.AutorecoveringConnection.recoverConnection(AutorecoveringConnection.java:388)
at com.rabbitmq.client.impl.recovery.AutorecoveringConnection.beginAutomaticRecovery(AutorecoveringConnection.java:360)
at com.rabbitmq.client.impl.recovery.AutorecoveringConnection.access$000(AutorecoveringConnection.java:48)
at com.rabbitmq.client.impl.recovery.AutorecoveringConnection$1.shutdownCompleted(AutorecoveringConnection.java:345)
at com.rabbitmq.client.impl.ShutdownNotifierComponent.notifyListeners(ShutdownNotifierComponent.java:75)
at com.rabbitmq.client.impl.AMQConnection$MainLoop.run(AMQConnection.java:572)
at java.lang.Thread.run(Thread.java:745)
Caused by: com.rabbitmq.client.ShutdownSignalException: connection error
at com.rabbitmq.utility.ValueOrException.getValue(ValueOrException.java:67)
at com.rabbitmq.utility.BlockingValueOrException.uninterruptibleGetValue(BlockingValueOrException.java:33)
at com.rabbitmq.client.impl.AMQChannel$BlockingRpcContinuation.getReply(AMQChannel.java:343)
at com.rabbitmq.client.impl.AMQConnection.start(AMQConnection.java:292)
... 8 more
Caused by: java.net.SocketException: Connection reset
at java.net.SocketInputStream.read(SocketInputStream.java:189)
at java.net.SocketInputStream.read(SocketInputStream.java:121)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
at java.io.BufferedInputStream.read(BufferedInputStream.java:265)
at java.io.DataInputStream.readUnsignedByte(DataInputStream.java:288)
at com.rabbitmq.client.impl.Frame.readFrom(Frame.java:95)
at com.rabbitmq.client.impl.SocketFrameHandler.readFrame(SocketFrameHandler.java:139)
at com.rabbitmq.client.impl.AMQConnection$MainLoop.run(AMQConnection.java:532)

Possible Cause The RabbitMQ service is not running on the NetWitness Server.
Solution Restart the RabbitMQ, SMS, and NetWitness Platform services using the following commands.
systemctl restart rabbitmq-server
systemctl restart rsa-sms
systemctl restart jetty
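
After the restart, one way to confirm that all three services came back up is to query each unit with systemctl is-active. The following is a minimal sketch, not part of the product: the function name check_services is hypothetical, and the unit names are the ones used in the commands above.

```shell
# Report whether each named systemd unit is active.
# Returns 1 if at least one unit is not running, 0 otherwise.
check_services() {
    failed=0
    for svc in "$@"; do
        if systemctl is-active --quiet "$svc"; then
            echo "$svc: running"
        else
            echo "$svc: NOT running"
            failed=1
        fi
    done
    return "$failed"
}
```

For example, run check_services rabbitmq-server rsa-sms jetty as root on the NetWitness Server. A non-zero exit status means that at least one of the services still needs attention.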

Message/Problem User Interface: Cannot connect to System Management Service
Cause The System Management Service, RabbitMQ, or Mongo service is not running.
Solution Run the following commands on NetWitness Server to make sure all these services are running.
[root@nwserver ~]# systemctl status rsa-sms
RSA NetWitness SMS :: Server is not running.
[root@nwserver ~]# systemctl start rsa-sms
Starting RSA NetWitness SMS :: Server...
[root@nwserver ~]# systemctl status rsa-sms
RSA NetWitness SMS :: Server is running (5687).
[root@nwserver ~]# systemctl status mongod
mongod (pid 2779) is running...
[root@nwserver ~]# systemctl status rabbitmq-server
Status of node nw@localhost ...
[{pid,2501},
{running_applications,
[{rabbitmq_federation_management,"RabbitMQ Federation Management","3.3.4"},

Message/Problem User Interface: Cannot connect to System Management Service
Possible Cause /var/lib/rabbitmq partition usage is 70% or greater.
Solution Contact Customer Care.

Message/Problem User Interface: Host migration failed.
Possible Cause One or more NetWitness Platform services may be in a stopped state.
Solution Make sure that the following services are running, then restart the NetWitness Server:
Archiver, Broker, Concentrator, Decoder, Event Stream Analysis, Response Server, IPDB Extractor, Log Collector, Log Decoder, Malware Analysis, Reporting Engine, Warehouse Connector, Workbench.

Message/Problem User Interface: Server Unavailable.
Possible Cause One or more NetWitness Platform services may be in a stopped state.
Solution Make sure that the following services are running, then restart the NetWitness Server: Archiver, Broker, Concentrator, Decoder, Event Stream Analysis, Response Server, IPDB Extractor, Log Collector, Log Decoder, Malware Analysis, Reporting Engine, Warehouse Connector, Workbench.

Message/Problem User Interface: Server Unavailable
Possible Cause System Management Service (SMS), RabbitMQ, or Mongo service is not running.
Solution 1 Run the following commands on NetWitness Server to make sure all these services are running.
[root@nwserver ~]# systemctl status rsa-sms
RSA NetWitness SMS :: Server is not running.
[root@nwserver ~]# systemctl start rsa-sms
Starting RSA NetWitness SMS :: Server...
[root@nwserver ~]# systemctl status rsa-sms
RSA NetWitness SMS :: Server is running (5687).
[root@nwserver ~]# systemctl status mongod
mongod (pid 2779) is running...
[root@nwserver ~]# systemctl status rabbitmq-server
Status of node nw@localhost ...
[{pid,2501},
{running_applications,
[{rabbitmq_federation_management,"RabbitMQ Federation Management","3.3.4"},
Solution 2 Make sure the /var/lib/rabbitmq partition is less than 75% full.
Solution 3 Check the NetWitness Server log file (/var/lib/netwitness/uax/logs/nw.log) for any errors.
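
The partition check in Solution 2 can be scripted with df. This is a minimal sketch under stated assumptions: the function names usage_pct and check_partition are hypothetical, and the default limit of 75 is taken from Solution 2 above.

```shell
# Print the usage percentage (without the % sign) of the filesystem holding a path.
# df -P keeps each filesystem on one line, so the percentage is always field 5.
usage_pct() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Warn when usage reaches the limit (second argument, default 75).
# Returns 0 when below the limit, 1 when at or above it, 2 if usage cannot be read.
check_partition() {
    path=$1
    limit=${2:-75}
    pct=$(usage_pct "$path")
    [ -n "$pct" ] || return 2
    if [ "$pct" -ge "$limit" ]; then
        echo "WARNING: $path is ${pct}% full (limit ${limit}%)"
        return 1
    fi
    echo "OK: $path is ${pct}% full"
}
```

For example, check_partition /var/lib/rabbitmq reports whether the RabbitMQ partition is still below the limit.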

Message/Problem Context Hub stops and does not allow you to add or edit data sources and lists.
Possible Cause Storage is 95% full or more.
Solution 1 Increase the storage by updating the YML file located at /etc/netwitness/contexthub-server/contexthub-server.yml.
For example, to increase storage from 120 GB to 150 GB, set the relevant parameter to the new size in bytes: rsa.contexthub.data.disk-size: 161061273600
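
The byte value in this example is 150 × 1024 × 1024 × 1024. A small helper can compute the value for any other size. This is a sketch, and the function name gb_to_bytes is hypothetical.

```shell
# Convert a size in GB (1 GB = 1024^3 bytes) to bytes.
gb_to_bytes() {
    echo $(( $1 * 1024 * 1024 * 1024 ))
}

gb_to_bytes 150   # prints 161061273600
```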

Solution 2 Delete unwanted or unused large lists.
Solution 3 Configure the TTL index for the list to automatically delete STIX and TAXII data and to clean up storage space.

Message/Problem Context Hub runs with a fixed amount of memory, 50% of which is reserved for cache. When the cache is 100% full, the cache stops responding and all new lookups are slow.
Possible Cause The cache is 50% full or more.
Solution 1 By default, Context Hub cleans the cache every 30 minutes. Reduce the cache expiration time of the data sources.
Solution 2 Disable the cache for data sources.
Solution 3 Increase the memory of the Context Hub Java process by editing the -Xmx option in the /etc/netwitness/contexthub-server/contexthub-server.conf file. In JAVA_OPTS, search for the -Xmx option.
For example, edit the entry as follows:
-Xmx8G
where 8G represents 8 GB of memory. Then restart the Context Hub service.

Note: The value must be less than the available system memory. Be aware that many other services run on the host.
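
The -Xmx edit in Solution 3 can also be made with sed instead of a text editor. The following is a minimal sketch: the function name set_xmx is hypothetical, and it assumes the existing option is written as -Xmx followed by digits and a single unit letter (for example -Xmx4G). It keeps a backup copy of the file with a .bak suffix.

```shell
# Replace the value of the first -Xmx option on each line of a file.
# $1 = file to edit, $2 = new size (for example 8G).
# The original file is preserved as "$1.bak".
set_xmx() {
    file=$1
    size=$2
    cp "$file" "$file.bak" || return 1
    sed "s/-Xmx[0-9][0-9]*[kKmMgG]/-Xmx$size/" "$file.bak" > "$file"
}
```

For example, set_xmx /etc/netwitness/contexthub-server/contexthub-server.conf 8G, followed by a restart of the Context Hub service. Check the result against the backup before restarting.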

Message/Problem List Data Source displays unhealthy statistics or status.
Possible Cause 1 Unable to:
  • access the data source
  • parse or read the CSV file
  • match the CSV file to the configured schema

Possible Cause 2 Unable to authenticate when accessing the data source.
Solution 1 Make sure that the CSV file is saved in the correct location, that is, /var/lib/netwitness/contexthub-server/data/, and verify that it has the required read permissions.
Solution 2 Make sure that the CSV file matches the schema specified when the data source was configured. If it does not, either create a new data source with the new schema or edit the CSV file to match the existing schema. For example, suppose you configure a List Data Source with a schema of column1, column2, and column3. If you later update the CSV file so that the number of columns increases or decreases, or the order of the columns changes, there is a schema mismatch and the configured List Data Source shows “Unhealthy” in the Health & Wellness statistics.
Solution 3 Make sure the password is correct. To confirm, edit the data source, enter the password, and click Test Connection.

For more information about these solutions, see the "Configure Lists as a Data Source" topic in the Context Hub Configuration Guide.
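
Before re-uploading a CSV file, a quick column count can catch one kind of schema mismatch described in Solution 2. This is a minimal sketch under stated assumptions: the function name check_csv_columns is hypothetical, and fields are split on every comma, so quoted fields that contain commas are not handled. It detects a change in the number of columns, not a change in their order.

```shell
# Check that every non-empty line of a CSV file has the expected number of columns.
# $1 = CSV file, $2 = expected column count.
# Prints one message per offending line and returns 1 if any line differs.
check_csv_columns() {
    file=$1
    expected=$2
    awk -F',' -v n="$expected" '
        NF == 0 { next }
        NF != n { printf "line %d: %d columns, expected %d\n", NR, NF, n; bad = 1 }
        END { exit bad + 0 }
    ' "$file"
}
```

For example, for the three-column schema above, check_csv_columns list.csv 3 prints nothing and returns 0 when every row has three columns. The file name list.csv is a placeholder.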

Issues Not Identified by the User Interface or Logs

This section provides troubleshooting information for issues that are not identified by messages that NetWitness Platform displays in the Health & Wellness interface or includes in the Health & Wellness log files. For example, you may see incorrect statistical information in the interface.

Problem Incorrect statistics are displayed in the Health & Wellness interface.
Possible Cause The SMS service is not running. It must be running on the NetWitness Server.
Solution Restart the SMS service.

Problem NetWitness Platform does not show the version to which you upgraded until you restart jettysrv (jeTTy server).
Possible Cause When NetWitness Platform checks a connection, it polls a service every 30 seconds to see if it is active. If the service comes back up within those 30 seconds, NetWitness Platform does not get the new version.
Solution
  1. Manually stop the service.
  2. Wait until you see that it is offline.
  3. Restart the service.
    NetWitness Platform displays the correct version.

Problem NetWitness Server does not display the Service Unavailable page.
Possible Cause After you upgrade to NetWitness Platform version 10.5, JDK 1.8 is not the default version, which causes jettysrv (jeTTy server) to fail to start. Without the jeTTy server, the NetWitness Server cannot display the Service Unavailable page.
Solution Restart jettysrv.

Problem The SMS service is stopped and the following error is displayed in the log file: java.lang.OutOfMemoryError: Java heap space

Solution

Increase the maximum Java heap size for the SMS service according to your needs:

  1. Open /opt/rsa/sms/conf/wrapper.conf

  2. Replace wrapper.java.additional.1=-Xmx16g with:
    wrapper.java.additional.1=-Xmx20g

  3. Restart the SMS service:
    systemctl restart rsa-sms