Physical Host 10.6.6.x to 11.3 Upgrade: Appendix A. Troubleshooting

Document created by RSA Information Design and Development on Apr 10, 2019. Last modified by David O'Malley on Jun 11, 2019.
Version 4

There are two sections in this appendix.

Section 1 - General Troubleshooting information

This section describes solutions to problems that you may encounter during installations and upgrades. In most cases, NetWitness Platform creates log messages when it encounters these problems.

Note: If you cannot resolve an upgrade issue using the following troubleshooting solutions, contact Customer Support (https://community.rsa.com/docs/DOC-1294).

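This section has troubleshooting documentation for the following services, features, and processes.

  • Command Line Interface (CLI)
  • Backup (nw-backup script)
  • Event Stream Analysis
  • Concentrator Service
  • Log Collector Service (nwlogcollector)
  • NW Server
  • Orchestration
  • Reporting Engine Service
  • NetWitness UEBA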

Go to the Master Table of Contents to find all NetWitness Platform Logs & Network 11.x documents.

Command Line Interface (CLI)

Error Message

Command Line Interface (CLI) displays: "Orchestration failed."

Mixlib::ShellOut::ShellCommandFailed: Command execution failed. STDOUT/STDERR suppressed for sensitive resource in /var/log/netwitness/config-management/chef-solo.log

Cause

The wrong deploy_admin password was entered in nwsetup-tui.

Solution

Retrieve your deploy_admin password and run nwsetup-tui again.

  1. SSH to the NW Server host and run the following command to retrieve the password.
    security-cli-client --get-config-prop --prop-hierarchy nw.security-client --prop-name deployment.password
  2. SSH to the host that failed.
  3. Run nwsetup-tui again using the correct deploy_admin password.

 

Error Message

ERROR com.rsa.smc.sa.admin.web.controller.ajax.health.AlarmsController - Cannot connect to System Management Service

Cause

NetWitness Platform sees the System Management Service (SMS) as down after a successful upgrade even though the service is running.

Solution

Restart the SMS service.
systemctl restart rsa-sms

 

Error Message

You receive a message in the User Interface to reboot the host after you update and reboot the host offline.

Cause

You cannot use the CLI to reboot the host. You must use the User Interface.

Solution

Reboot the host in the Host View in the User Interface.

Backup (nw-backup script)

Error Message

WARNING: Incorrect ESA Mongo admin password for host <hostname>.

Cause

The ESA Mongo admin password contains special characters (for example, ‘!@#$%^qwerty’).

Solution

Change the ESA Mongo admin password back to the original default of ‘netwitness’ before running the backup.

 

Error

Backup errors caused by the immutable attribute setting.

Cause

If any files have the immutable flag set (to keep the Puppet process from overwriting a customized file), those files are not included in the backup and an error is generated.

Solution

On the host that contains the files with the immutable flag set, run the following command on each file to remove the immutable setting:
chattr -i <filename>
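
Before running chattr, it can help to list which files actually carry the immutable flag. The following is a minimal sketch, not part of the nw-backup script: the helper name list_immutable is hypothetical, and it assumes the usual lsattr output layout of an attribute field followed by the path.

```shell
# Print the paths of files whose attribute field contains the immutable
# flag ("i"). Reads lsattr-style lines ("<flags> <path>") from stdin.
# Directory headers and blank lines produced by "lsattr -R" are skipped
# because their first field is not made up of flag characters only.
list_immutable() {
  awk '$1 ~ /^[-a-zA-Z]+$/ && $1 ~ /i/ { sub(/^[^ ]+ +/, ""); print }'
}

# Example use (review the list before changing anything):
#   lsattr -R /etc 2>/dev/null | list_immutable
# Then, for each file you decide to include in the backup:
#   chattr -i <filename>
```

The helper only reports files; removing the flag stays a deliberate manual step, because the flag was originally set to protect customized files.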

 

Error

Error creating Network Configuration Information file due to duplicate or bad entries in primary network configuration file:
/etc/sysconfig/network-scripts/ifcfg-em1
Verify contents of /var/netwitness/logdecoder/packetdb/nw-backup/2018-02-23/S5-BROK-36-10.25.53.36-network.info.txt

Cause

The primary Ethernet interface configuration file on the host being backed up contains incorrect or duplicate entries for one or more of the following fields: DEVICE, BOOTPROTO, IPADDR, NETMASK, or GATEWAY.

Solution

Manually create a file at the backup location on the external backup server, and also at the backup location local to the host where other backups have been staged. The file name must have the format <hostname>-<hostip>-network.info.txt, and the file must contain the following entries:
DEVICE=<devicename> ; # from the host's primary ethernet interface config file

BOOTPROTO=<bootprotocol> ; # from the host's primary ethernet interface config file

IPADDR=<value> ; # from the host's primary ethernet interface config file

NETMASK=<value> ; # from the host's primary ethernet interface config file

GATEWAY=<value> ; # from the host's primary ethernet interface config file

search <value> ; # from the host's /etc/resolv.conf file

nameserver <value> ; # from the host's /etc/resolv.conf file

Event Stream Analysis

  • For ESA Correlation troubleshooting information, see the Alerting with ESA Correlation Rules User Guide.
  • For ESA Analytics troubleshooting information, see the Automated Threat Detection Configuration Guide.

Concentrator Service

Problem

After you upgrade to 11.3.0.0, pivot to navigate query fails if the Concentrator service version is 10.6.x.

Cause

The pivot to Navigate query fails because it contains meta entities, and the 10.6.x Concentrator service does not support meta entities.

Solution

Edit the query and remove the meta entities. For example, if the query is for user, remove the user.all meta entity and re-run the query.

Log Collector Service (nwlogcollector)

Log Collector logs are posted to /var/log/install/nwlogcollector_install.log on the host running the nwlogcollector service.

Error Message

<timestamp>.NwLogCollector_PostInstall: Lockbox Status : Failed to open lockbox: The lockbox stable value threshold was not met because the system fingerprint has changed. To reset the system fingerprint, open the lockbox using the passphrase.

Cause

The Log Collector Lockbox failed to open after the update.

Solution

Log in to NetWitness Platform and reset the system fingerprint by resetting the stable system value password for the Lockbox, as described in the "Reset the Stable System Value" topic under the "Configure Lockbox Security Settings" topic in the Log Collection Configuration Guide.

 

Error Message

<timestamp> NwLogCollector_PostInstall: Lockbox Status : Not Found

Cause

The Log Collector Lockbox is not configured after the update.

Solution

If you use a Log Collector Lockbox, log in to NetWitness Platform and configure the Lockbox as described in the "Configure Lockbox Security Settings" topic in the Log Collection Configuration Guide.

 

Error Message

<timestamp>: NwLogCollector_PostInstall: Lockbox Status : Lockbox maintenance required: The lockbox stable value threshold requires resetting. To reset the system fingerprint, select Reset Stable System Value on the settings page of the Log Collector.

Cause

You need to reset the stable value threshold field for the Log Collector Lockbox.

Solution

Log in to NetWitness Platform and reset the stable system value password for the Lockbox as described in the "Reset the Stable System Value" topic under the "Configure Lockbox Security Settings" topic in the Log Collection Configuration Guide.

 

Problem

You have prepared a Log Collector for upgrade and no longer want to upgrade at this time.

Cause

Delay in upgrade.

Solution

Use the following command to revert a Log Collector that has been prepared for upgrade so that it resumes normal operation.

# /opt/rsa/nwlogcollector/nwtools/prepare-for-migrate.sh --revert

NW Server

These logs are posted to /var/netwitness/uax/logs/sa.log on the NW Server Host.

Problem

After upgrade, Audit logs are not forwarded to the configured Global Audit Setup, or the following message appears in sa.log.
Syslog Configuration migration failed. Restart jetty service to fix this issue

Cause

The NW Server Global Audit setup failed to migrate from 10.6.6.x to 11.3.0.0.
Solution
  1. SSH to the NW Server.
  2. Submit the following command.
    orchestration-cli-client --update-admin-node

Orchestration

The orchestration server logs are posted to /var/log/netwitness/orchestration-server/orchestration-server.log on the NW Server Host.

Problem
  1. Tried to upgrade a non-NW Server host and it failed.
  2. Retried the upgrade for this host and it failed again.

 

You will see the following message in orchestration-server.log.
"'file' __virtual__ returned False: cannot import name HASHES"

Cause

The Salt minion may have been upgraded but never restarted on the failed non-NW Server host.
Solution
  1. SSH to the non-NW Server host that failed to upgrade.
  2. Submit the following commands.
    systemctl unmask salt-minion
    systemctl restart salt-minion
  3. Retry the upgrade of the non-NW Server host.
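
To confirm that a repeated failure matches this case before restarting the Salt minion, the log can be searched for the message text. A minimal sketch follows; the function name needs_salt_minion_restart is hypothetical, and the default path is the orchestration server log location given at the start of this section.

```shell
# Return success (0) if the given orchestration-server log contains the
# "cannot import name HASHES" message described above.
# With no argument, the default log path on the NW Server is used.
needs_salt_minion_restart() {
  local log="${1:-/var/log/netwitness/orchestration-server/orchestration-server.log}"
  grep -q 'cannot import name HASHES' "$log"
}

# Example use on the NW Server:
#   if needs_salt_minion_restart; then
#     echo "Restart salt-minion on the failed host, then retry the upgrade."
#   fi
```

Note that the check runs on the NW Server, where the log is written, while the systemctl commands in step 2 run on the host that failed to upgrade.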

Reporting Engine Service 

Reporting Engine Update logs are posted to the /var/log/re_install.log file on the host running the Reporting Engine service.

Error Message

<timestamp> : Available free space in /var/netwitness/re-server/rsa/soc/reporting-engine [ <existing-GB> ] is less than the required space [ <required-GB> ]

Cause

The update of the Reporting Engine failed because there is not enough disk space.

Solution

Free up enough disk space to meet the required space shown in the log message. See the "Add Additional Space for Large Reports" topic in the Reporting Engine Configuration Guide for instructions on how to free up disk space.
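
After freeing space, the available space can be compared with the required value from the log message before retrying the update. This is a sketch only: the function names are hypothetical, and it assumes the GB figure in the log is 1024-based, which the log message does not state.

```shell
# Print the available space, in 1024-byte blocks, of the filesystem that
# holds the given directory (POSIX "df -Pk" output, second line, column 4).
available_kb() {
  df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Return success (0) if the directory's filesystem has at least the
# required number of GB available; return 2 if df produced no value.
#   $1 = directory, $2 = required space in whole GB
has_required_space() {
  local dir="$1" required_gb="$2" avail_kb
  avail_kb="$(available_kb "$dir")"
  [ -n "$avail_kb" ] || return 2
  [ "$avail_kb" -ge $(( required_gb * 1024 * 1024 )) ]
}

# Example use, with the required value taken from the log message:
#   has_required_space /var/netwitness/re-server/rsa/soc/reporting-engine <required-GB> \
#     && echo "Enough space to retry the update."
```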

NetWitness UEBA

Problem

The User Interface is not accessible.

Cause

More than one NetWitness UEBA service exists in your NetWitness deployment. You can have only one NetWitness UEBA service in your deployment.

Solution

Complete the following steps to remove the extra NetWitness UEBA service.

  1. SSH to the NW Server and run the following command to list the installed NetWitness UEBA services.
    # orchestration-cli-client --list-services|grep presidio-airflow
    ... Service: ID=7e682892-b913-4dee-ac84-ca2438e522bf, NAME=presidio-airflow, HOST=xxx.xxx.xxx.xxx:null, TLS=true
    ... Service: ID=3ba35fbe-7220-4e26-a2ad-9e14ab5e9e15, NAME=presidio-airflow, HOST=xxx.xxx.xxx.xxx:null, TLS=true
  2. From the list of services, determine which instance of the presidio-airflow service should be removed (by looking at the host addresses).

  3. Run the following command to remove the extra service from Orchestration (use the matching service ID from the list of services):
    # orchestration-cli-client --remove-service --id <ID-for-presidio-airflow-from-previous-output>
  4. Run the following command to update node 0 to restore NGINX:
    # orchestration-cli-client --update-admin-node
  5. Log in to NetWitness Platform, go to ADMIN > Hosts, and remove the extra NetWitness UEBA host.
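
When the service list is long, the ID and host of each presidio-airflow entry can be pulled out of the listing to make step 2 easier. A minimal sketch, assuming the line layout shown in step 1; the function name list_ueba_services is hypothetical.

```shell
# Read "orchestration-cli-client --list-services" output from stdin and
# print "<ID> <HOST>" for every presidio-airflow service line.
list_ueba_services() {
  sed -n 's/.*ID=\([^,]*\), NAME=presidio-airflow, HOST=\([^,]*\),.*/\1 \2/p'
}

# Example use on the NW Server:
#   orchestration-cli-client --list-services | list_ueba_services
# More than one output line means an extra UEBA service is registered.
```

The sketch only reports the entries. Choosing which ID to pass to --remove-service remains a manual decision based on the host addresses.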

Section 2 - Hardware-Related Troubleshooting Information

Error Message

When you restart a Series 4 Appliance with external storage, the following messages are displayed.

Cause

If you upgrade a Series 4 Appliance host with external storage (for example, a DAC) to 11.2 and try to restart the appliance, the system may recognize it as having a foreign configuration.

Solution
  1. Press the F key and restart the appliance.
    If this successfully imports the configuration and restarts the appliance, you are finished. If it does not work, go to step 2.

  2. Press C to start the Configuration utility.
    1. Select the PERC H8x0 Adapter.

    2. Highlight the top row [for example, PERC H810 Adapter (Bus 65, Dev 0)].

    3. Select Foreign View from the menu bar.

    4. Press F2 to display the Foreign Config drop down menu and select Import.
    5. Select Yes to confirm that you want to import the foreign config.

    6. Verify that there are no more foreign configs present on the system.
    7. Press the Esc key to exit.
    8. Select Yes to confirm that you want to exit.
  3. Press Ctrl-Alt-Delete to restart (reboot) the appliance.

Caution: If the foreign config import fails, contact Customer Support (https://community.rsa.com/docs/DOC-1294).

 

Problem

The mtu.conf and pf_ring files for the 10G Decoder were not restored from the ./etc/init/pfring_bkup directory after upgrade.

Cause

If you use the 10G Decoder hardware driver and you customized the /etc/init.d/pf_ring script to use MTU from the /etc/pf_ring/mtu.conf file, the mtu.conf and pf_ring files from the ./etc/init/pfring_bkup directory are not restored after upgrade.

Solution

Complete the following steps to restore the files.

  1. Restore the pf_ring file to the /etc/init.d/ directory in 11.3.
    /etc/init.d/pf_ring
  2. Restore the mtu.conf file to the /etc/pf_ring/ directory in 11.3.
    /etc/pf_ring/mtu.conf
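
The two restore steps can be sketched as a small copy routine. The function name restore_pfring is hypothetical, and the sketch assumes the backup directory holds the files under the names pf_ring and mtu.conf. The optional second argument is a target root used only for trying the routine against a scratch directory.

```shell
# Copy pf_ring and mtu.conf from a backup directory to their 11.3 locations.
#   $1 = backup directory (for example, the pfring_bkup directory)
#   $2 = optional target root; empty means the real filesystem root
restore_pfring() {
  local backup_dir="$1" root="${2:-}"
  if [ ! -f "$backup_dir/pf_ring" ] || [ ! -f "$backup_dir/mtu.conf" ]; then
    echo "pf_ring or mtu.conf not found in $backup_dir" >&2
    return 1
  fi
  mkdir -p "$root/etc/init.d" "$root/etc/pf_ring"
  # -p preserves the mode and timestamps of the customized files.
  cp -p "$backup_dir/pf_ring" "$root/etc/init.d/pf_ring"
  cp -p "$backup_dir/mtu.conf" "$root/etc/pf_ring/mtu.conf"
}

# Example use (dry run into a scratch directory first):
#   restore_pfring <path-to-pfring_bkup> /tmp/restore-test
```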

 

 

 

 

 
