Sys Maintenance: Monitor Health and Wellness Using SNMP Alerts

Document created by RSA Information Design and Development on Sep 14, 2017. Last modified by RSA Information Design and Development on Sep 12, 2018.

You can monitor a NetWitness Server component and proactively send alerts using Simple Network Management Protocol (SNMP), based on thresholds or system failures.

You can monitor the following for NetWitness Platform components: 

  • CPU utilization that reaches a defined threshold.
  • Memory utilization that reaches a defined threshold.
  • Disk utilization that reaches a defined threshold.

SNMP Configuration

The NetWitness Servers can be configured to send SNMPv3 Threshold Traps and Monitor Traps. Threshold traps are sent by the NetWitness Platform Core applications themselves when configured node thresholds are crossed. Monitor traps are sent by the SNMP daemon for the items indicated in its configuration file. To receive SNMP traps from NetWitness Platform, you must set up an SNMP daemon on another service. You set up SNMP on NetWitness Platform in the configuration settings for the NetWitness Server. For more information, see "Service Configuration Settings" in the NetWitness Platform Host and Services Getting Started Guide for the specific host.

Thresholds

Thresholds can be set on any service statistic that accepts the setLimit message. You can retrieve the current thresholds using the getLimit message. To set a limit, pass a low and a high threshold value.

When the value of the stat crosses either the low or the high threshold, an SNMP trap is triggered to indicate that the threshold was crossed. No further trap is triggered while the value stays below the low or above the high value, but another trap is triggered when it crosses back into the normal range (above the low and below the high).
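
The crossing behavior described above can be modeled in a few lines of Python. This is only an illustrative sketch, not the NetWitness implementation: the names `classify` and `trap_events` are hypothetical, and the sketch assumes that a value equal to a limit counts as outside the normal range and that the stat starts inside it.

```python
def classify(value, low, high):
    """Return which band a sample falls in (assumption: limits are inclusive)."""
    if value <= low:
        return "below-low"
    if value >= high:
        return "above-high"
    return "normal"


def trap_events(samples, low, high):
    """Return (old, new, state) for every sample that changes band.

    A trap is modeled only on a change of band, so a value that stays
    outside the normal range does not produce repeated traps.
    """
    events = []
    state = "normal"  # assumption: the stat starts in the normal range
    old = None
    for new in samples:
        new_state = classify(new, low, high)
        if new_state != state:
            events.append((old, new, new_state))
        state, old = new_state, new
    return events


cpu_samples = [50, 77, 91, 95, 60, 5, 4, 30]
for event in trap_events(cpu_samples, low=10, high=90):
    print(event)
# (77, 91, 'above-high')
# (95, 60, 'normal')
# (60, 5, 'below-low')
# (4, 30, 'normal')
```

In this model the samples 95 and 4 produce no trap because the stat was already outside the normal range when they arrived.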

You must set the threshold for the service using the Service Explorer view or the REST API.

The following sample threshold monitors CPU usage and triggers a trap when usage falls below 10% or rises above 90%:

/sys/stats/cpu setLimit low=10 high=90

The following example shows how the same threshold is set using the REST API:

http://<log decoder>:50102/sys/stats/cpu?msg=setLimit&low=10&high=90
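
If you script this call, the URL can be assembled rather than typed by hand. The following Python sketch only builds the URL shown above and does not send it. The helper name `build_set_limit_url` and the host name `logdecoder.example.com` are placeholders invented for illustration; port 50102 and the query parameters come from the example above.

```python
from urllib.parse import urlencode


def build_set_limit_url(host, port, stat_path, low, high):
    """Build a setLimit REST URL for a stat node (hypothetical helper)."""
    query = urlencode({"msg": "setLimit", "low": low, "high": high})
    return f"http://{host}:{port}{stat_path}?{query}"


# logdecoder.example.com is a placeholder for your Log Decoder host.
url = build_set_limit_url("logdecoder.example.com", 50102, "/sys/stats/cpu", 10, 90)
print(url)
# http://logdecoder.example.com:50102/sys/stats/cpu?msg=setLimit&low=10&high=90
```

Sending the request also requires whatever authentication the service expects, which this sketch does not cover.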

If the CPU usage spikes to 90% or higher, an SNMP trap is generated:

23435333 2013-Dec-16 11:08:35 Threshold warning path=/sys/stats/cpu old=77% new=91
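
If you collect these messages on the receiving side, a small parser can split them into fields. The Python sketch below is inferred from the single sample line above, so the message layout it expects is an assumption rather than a documented format. The name `parse_threshold_warning` is hypothetical.

```python
import re

# Layout assumed from the one sample line: id, date, time, text, key=value pairs.
TRAP_LINE = re.compile(
    r"^(?P<id>\d+)\s+(?P<timestamp>\S+ \S+)\s+Threshold warning "
    r"path=(?P<path>\S+) old=(?P<old>\S+) new=(?P<new>\S+)$"
)


def parse_threshold_warning(line):
    """Return the fields of a threshold warning as a dict, or None if no match."""
    match = TRAP_LINE.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    for key in ("old", "new"):
        try:
            # Numeric stats: drop a trailing percent sign and convert.
            fields[key] = float(fields[key].rstrip("%"))
        except ValueError:
            pass  # leave non-numeric values as strings
    return fields


sample = "23435333 2013-Dec-16 11:08:35 Threshold warning path=/sys/stats/cpu old=77% new=91"
print(parse_threshold_warning(sample))
```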

Configure SNMPv3 for a Host

  1. Go to ADMIN > Services.
    The Services view is displayed.
  2. Select the service.
  3. In the Actions column, select View > Explore.
  4. In the nodes list, expand the list and select a config folder, for example, logs > config.
  5. Set the SNMPv3 configuration.


Set the Threshold for a Service

  1. Go to ADMIN > Services.
    The Services view is displayed.
  2. Select the service.
  3. In the Actions column, select View > Explore.
  4. In the nodes list, expand the list and select a stat folder.
  5. Select a stat, for example, cpu, and right-click.
  6. From the drop-down menu, select Properties.

    The Properties panel is displayed. The Properties panel has a drop-down list of available messages for the parameter.


  7. Select setLimit.
  8. Specify the low and high values.

SNMP Traps for System Status

The threshold mechanism can also be used to monitor string-valued stats generated by Core services. There are two ways to monitor string-valued stats:

  1. Generate a trap whenever the status value is NOT an expected value. For example, if you want to monitor the stat /broker/stats/status and generate a trap whenever the value is not started, set the high limit on the stat to the expected value. You would use the setLimit message on /broker/stats/status as follows:
    setLimit high=started
  2. Generate a trap whenever the status value matches an expected value. To do this, use the low limit on the stat. For example, if you want to generate a trap when the stat /sys/stats/service.status has the value "Initialization Failure", you would use the setLimit message on /sys/stats/service.status as follows:
    setLimit low="Initialization Failure"

In both scenarios, you can check for multiple values by specifying a comma-separated list of values.
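
The two string-matching rules above can be summarized in a short Python model. This is a sketch of the described semantics only, not the product's code: the names `parse_values` and `should_trap` are hypothetical, and trimming whitespace around each comma-separated value is an assumption.

```python
def parse_values(limit):
    """Split a comma-separated limit into a set of values (None if unset)."""
    if limit is None:
        return None
    # Assumption: surrounding whitespace is not significant.
    return {item.strip() for item in limit.split(",")}


def should_trap(value, low=None, high=None):
    """Model the string-valued rules described above.

    high: expected values; trap when the status is NOT one of them.
    low:  watched values; trap when the status IS one of them.
    """
    expected = parse_values(high)
    watched = parse_values(low)
    if expected is not None and value not in expected:
        return True
    if watched is not None and value in watched:
        return True
    return False


print(should_trap("started", high="started"))                               # False
print(should_trap("stopped", high="started"))                               # True
print(should_trap("Initialization Failure", low="Initialization Failure"))  # True
print(should_trap("starting", high="started,starting"))                     # False
```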
