This section describes common issues that may occur while using ESA and suggests solutions to them.
Troubleshoot ESA Services
Troubleshoot RSA Live Rules for ESA
Step 1: Verify that your Host Is Running
The first step in troubleshooting is to verify that your host is running. To do this, go to ADMIN > Hosts. If the host is down, the system parameters do not display (note that host information can sometimes be slow to update), the Services display in red, and the Updates field displays an error message.
If your host is down, contact your NetWitness Platform Administrator to restart it. Otherwise, go to Step 2.
Step 2: View Detailed Statistics in Health & Wellness
When you are sure your ESA service is down, you can go to Health & Wellness to see where potential issues are occurring. The most common problem is that your ESA service is exceeding memory thresholds, which causes it to stop or fail.
Go to ADMIN > Health & Wellness > Alarms to see if the ESA triggered any alarms. Look for the following alarms:
- ESA Overall Memory Utilization > 85%
- ESA Overall Memory Utilization > 95%
- ESA Service Stopped
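As a rough illustration of how the two memory alarms relate to each other, the following minimal sketch maps a memory reading to the alarm names listed above. The function name and its inputs are hypothetical. Only the 85% and 95% thresholds come from the alarm names, and how the product evaluates them internally is not described here.

```python
# Thresholds taken from the alarm names listed above (85% and 95%).
WARNING_PCT = 85.0
CRITICAL_PCT = 95.0


def matching_memory_alarms(used_bytes: int, total_bytes: int) -> list[str]:
    """Return the memory alarm names whose threshold the utilization exceeds.

    Hypothetical helper for illustration only; it is not part of the product.
    """
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    utilization_pct = 100.0 * used_bytes / total_bytes
    alarms = []
    if utilization_pct > WARNING_PCT:
        alarms.append("ESA Overall Memory Utilization > 85%")
    if utilization_pct > CRITICAL_PCT:
        alarms.append("ESA Overall Memory Utilization > 95%")
    return alarms
```

A reading above 95% exceeds both thresholds, so in this sketch it matches both alarm names.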
Go to ADMIN > Health & Wellness > System Stats Browser to see the memory metrics for each rule's performance. To view the metrics, enter the following:
The memory used by each rule is displayed, in bytes, in the Value column. You can see the history of memory usage in the Historical Graph column.
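Because the Value column reports raw bytes, large numbers can be hard to compare at a glance. The sketch below assumes you have copied the per-rule values into a dictionary by hand. It converts them to readable units and ranks the rules from highest to lowest memory use. The function names are hypothetical and are not part of the product.

```python
def format_bytes(num_bytes: int) -> str:
    """Convert a raw byte count into a readable string using binary units."""
    value = float(num_bytes)
    for unit in ("B", "KiB", "MiB", "GiB"):
        if value < 1024.0:
            return f"{value:.1f} {unit}"
        value /= 1024.0
    return f"{value:.1f} TiB"


def rank_rules_by_memory(rule_memory: dict[str, int]) -> list[tuple[str, str]]:
    """Return (rule name, readable size) pairs, largest memory consumer first.

    rule_memory maps a rule name to the byte value copied from the Value column.
    """
    ranked = sorted(rule_memory.items(), key=lambda item: item[1], reverse=True)
    return [(name, format_bytes(size)) for name, size in ranked]
```

The rules at the top of the ranked list are the first candidates to review in the later steps.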
Go to ADMIN > Health & Wellness > System Stats Browser to see details of your ESA performance. Select your host, and use the following filters to view the following statistics:
If you are having a problem with memory or CPU utilization, continue to Step 3.
Step 3: Bring up your ESA Services
- Go to ADMIN > Services, select your ESA service, and then select > Start.
- Return to the ESA Service to troubleshoot which rules have created memory issues.
If your ESA service is stopping and restarting in a loop, you may need to call Customer Support to get the services to start.
If you are able to start your ESA service without a shutdown, continue to Step 4.
Step 4: Check the Alerts and Events Volume
After you are able to restart your ESA service without an immediate shutdown, you can review the stats for your rules to see which rules are consuming too many resources. Sometimes, ESA services fail because a rule is generating too many alerts or a rule is matching too many events. Check for both of these issues if you have determined that memory usage is causing your ESA service to shut down.
View Alert Summaries
Rules that generate a high volume of alerts can overwhelm the system and cause it to fail or restart. To view the alert summaries, go to RESPOND > Alerts. In the Filters panel on the left, in the ALERT NAMES section, select the alert name for the rule. The number of alerts with that name appears at the bottom of the Alerts list results. If the number is unusually high for a particular rule, disable the rule and rewrite it to be more efficient.
To clear your filter, click Reset Filters.
View Events Matched
Sometimes a rule matches too many events, which can use excessive memory. This typically occurs if you create a large event window in which a great number of events accumulate without triggering an alert. Large windows are a problem because each matched event is held in memory while the rule waits for the alert to trigger. To check for this issue, go to CONFIGURE > ESA Rules > Services. From there, you can see the number of events that were matched in the Events Matched column. If a rule has matched a high number of events, investigate the rule further to see if you can make it more efficient.
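To see why a large event window can exhaust memory, it helps to do a back-of-the-envelope estimate. The sketch below assumes a worst case in which every matching event is held for the full window, and it assumes an average event size that you supply. Both are assumptions for illustration, and the memory that the ESA service actually uses per event may differ.

```python
def estimate_window_memory_bytes(
    matched_events_per_second: float,
    window_seconds: float,
    avg_event_bytes: int,
) -> int:
    """Rough upper bound on memory held by one rule's event window.

    Assumes every matched event stays in memory until the window closes.
    All three inputs are estimates supplied by the reader.
    """
    held_events = matched_events_per_second * window_seconds
    return int(held_events * avg_event_bytes)


# Hypothetical numbers: 500 matched events/s, a one-hour window, 1 KiB per event.
estimate = estimate_window_memory_bytes(500, 3600, 1024)
print(estimate)  # 1843200000 bytes, roughly 1.7 GiB
```

Under these assumptions, shortening the window or narrowing the match condition reduces the estimate in direct proportion.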
Step 5: Disable and Repair the Rule that Caused Issues
Once you have determined which rules need to be rewritten, disable them and rewrite them so that they do not generate such a high volume of alerts or match so many events. For pointers on how to write more efficient rules, see Best Practices.
- To disable rules, go to CONFIGURE > ESA Rules > Services, and select the rules you want to disable in the Deployed Rules Stats field.
- Select Disable to disable the rules.
- To repair the rules, go to CONFIGURE > ESA Rules > Rules tab > Rule Library.
- Select the rule to edit and then select > Edit.
- Edit the rule to be more efficient. For instructions on creating rules, see Add Rules to the Rule Library.
- When you are satisfied with your rule, you can save the rule as a trial rule to ensure that any memory issues do not affect ESA services performance. To do this, follow the steps listed in Work with Trial Rules.
- To enable rules, go to CONFIGURE > ESA Rules > Services, and select the rules you want to enable in the Deployed Rules Stats field.
- Select Enable to enable the rules.
(Optional) Check the ESA Log Files for More Information
Once you have verified that your services are down and identified some potential causes, check whether the service is stopping and restarting in a loop. To do this, view the ESA logs. From the ADMIN > Services view, select your ESA service, and then select > View > Logs.
If you cannot access the ESA logs from the NetWitness Platform interface, you can use SSH to log in to the system and open /opt/rsa/esa/logs/esa.log.
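When reading the log file over SSH, a short script can narrow a large file down to the most recent lines of interest. The following is a minimal sketch. The keywords ERROR and WARN are assumptions, because the format of esa.log is not described here, so replace them with whatever text marks service starts and stops in your log. The function name is hypothetical.

```python
from collections import deque
from pathlib import Path


def recent_matching_lines(log_path, keywords=("ERROR", "WARN"), max_lines=50):
    """Return the last max_lines lines that contain any of the keywords.

    The default keywords are assumptions; adjust them to match your log.
    """
    recent = deque(maxlen=max_lines)  # keeps only the newest matches
    with Path(log_path).open("r", encoding="utf-8", errors="replace") as handle:
        for line in handle:
            if any(keyword in line for keyword in keywords):
                recent.append(line.rstrip("\n"))
    return list(recent)
```

For example, calling recent_matching_lines("/opt/rsa/esa/logs/esa.log") on the ESA host returns up to 50 of the newest matching lines. Repeated start and stop messages close together in time would be consistent with a restart loop.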