|Applies To||RSA Product Set: SecurID|
RSA Product/Service Type: Authentication Manager
RSA Version/Condition: 3.0, 8.1
Platform: RSA SecurID Appliance 250 devices with RAID controllers
|Issue||RSA support may request RAID controller logs if the following occurs:|
- The RSA SecurID Appliance fault LED is on;
- The RAID controller displays a solid amber LED;
- Frequent fsck during reboot; and/or
- The system cannot boot and is not able to mount the Logical Volume Manager (LVM).
|Tasks||The following procedure collects MegaRAID controller logs from the command line. Unlike gathering this information from the BIOS, this process does not require a reboot.|
- Download the attached 8-07-14_MegaCLI.zip file.
- Unzip the file and copy the MegaCli-8.07.14-1.noarch.rpm file to /tmp/ on the server using a secure file copy app, such as scp, WinSCP or FileZilla.
- Once the file has been copied, SSH to the appliance and log in as emcsrv (for RSA SecurID Appliance 3.0) or as rsaadmin (for Authentication Manager 8.1).
- Run the commands below:
login as: rsaadmin
Using keyboard-interactive authentication.
Password: <enter OS password>
Last login: Wed Oct 7 16:30:21 2015 from jumphost.vcloud.local
RSA Authentication Manager Installation Directory: /opt/rsa/am
rsaadmin@am81p:~> sudo su -
rsaadmin's password: <enter OS password>
am81p:~ # rpm -Uvh /tmp/MegaCli-8.07.14-1.noarch.rpm
am81p:~ # alias MegaCli="/opt/MegaRAID/MegaCli/MegaCli64"
am81p:~ # MegaCli -EncInfo -aALL | more
am81p:~ # MegaCli -AdpBbuCmd -aALL | more
am81p:~ # MegaCli -CfgDsply -aALL | more
am81p:~ # MegaCli -AdpAllInfo -aALL | more
am81p:~ # MegaCli -LDInfo -Lall -aALL | more
am81p:~ # MegaCli -PDList -aALL | more
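Because RSA support usually needs the output itself, it can be convenient to write each command's output to a file instead of paging it through more. The following is a minimal sketch, assuming the MegaCli64 path used in the alias above; the output directory /tmp/raid-logs and the function name collect_raid_logs are illustrative choices, not part of the RSA procedure.

```shell
# Sketch: save the output of each MegaCli command from the procedure above
# to its own text file so it can be attached to a support case.
# OUTDIR and the function name are illustrative assumptions.
MEGACLI="${MEGACLI:-/opt/MegaRAID/MegaCli/MegaCli64}"
OUTDIR="${OUTDIR:-/tmp/raid-logs}"

collect_raid_logs() {
    mkdir -p "$OUTDIR" || return 1
    # Each entry is "<output file name>:<MegaCli arguments>".
    for entry in \
        "EncInfo:-EncInfo -aALL" \
        "AdpBbuCmd:-AdpBbuCmd -aALL" \
        "CfgDsply:-CfgDsply -aALL" \
        "AdpAllInfo:-AdpAllInfo -aALL" \
        "LDInfo:-LDInfo -Lall -aALL" \
        "PDList:-PDList -aALL"
    do
        name="${entry%%:*}"   # text before the first colon
        args="${entry#*:}"    # text after the first colon
        # $args is deliberately unquoted so sh/bash splits it into arguments.
        "$MEGACLI" $args > "$OUTDIR/$name.txt" 2>&1
    done
}
```

After defining the function in a root shell, running collect_raid_logs should leave six .txt files in the output directory, which can then be copied off the appliance with the same secure file copy tool used earlier.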
- Review the logs for any BBU (Backup Battery Unit) failures or ECC (Error-correcting code memory) errors.
- The RAID controller uses the BBU to run its built-in OS and to save the RAID configuration. If the BBU needs replacement, the RAID array will be degraded.
- ECC memory corrects single-bit errors. If ECC errors occur and persist, the controller memory has failed, and therefore the controller has failed.
- When investigating the errors, look for the following most common errors:
- BBU charging status;
- BBU learn cycle requested (Healthy Battery should be "NO");
- Battery replacement required (Healthy Battery should be "NO");
- Make sure the number of degraded and offline physical drives is zero;
- Make sure the virtual drive status is Optimal (not Degraded);
- Confirm that none of the PDs (physical drives) is in the JBOD, unconfigured_good, or unconfigured_bad state; and
- Make sure there are no PFA (Predictive Failure Analysis) messages on any of the physical drives.
- The RAID controller is built into the RSA SecurID Appliance, so if the RAID controller fails, the entire appliance must be returned under an RMA.
- See the attachment for a sample output of the above commands.
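The review checklist above can be partly automated. The following is a rough sketch, not an RSA-supplied tool: the function name check_raid_report is hypothetical, and the field labels it matches (for example "Battery Replacement required", "Firmware state", "Predictive Failure Count", "Memory Correctable Errors") are assumptions based on typical MegaCli output that may differ between MegaCli versions. Compare them with the attached sample output before relying on the result.

```shell
# Sketch: scan MegaCli output (read from standard input) for the warning
# signs in the checklist above. All field labels matched here are assumed
# and should be verified against real output from the appliance.
check_raid_report() {
    awk -F':' '
        function warn(line) { print "WARNING: " line; bad = 1 }
        # Take the text after the first colon and trim surrounding blanks.
        { val = $2; gsub(/^[ \t]+|[ \t]+$/, "", val) }
        # BBU: a healthy battery reports "No" for both fields.
        /Learn Cycle Requested/        && val != "No" { warn($0) }
        /Battery Replacement required/ && val != "No" { warn($0) }
        # Controller memory: ECC error counters should be zero.
        /Memory (Correctable|Uncorrectable) Errors/ && val != "0" { warn($0) }
        # Drive counters: degraded, offline, critical and failed should be zero.
        /^[ \t]*(Degraded|Offline|Critical Disks|Failed Disks)[ \t]*:/ && val != "0" { warn($0) }
        # Virtual drive state should be Optimal.
        /^State[ \t]*:/ && val != "Optimal" { warn($0) }
        # Physical drives should not be JBOD or unconfigured.
        /^Firmware state/ && tolower(val) ~ /jbod|unconfigured/ { warn($0) }
        # No predictive failures should be reported.
        /Predictive Failure Count/ && val != "0" { warn($0) }
        END { exit bad + 0 }
    '
}
```

For example, `/opt/MegaRAID/MegaCli/MegaCli64 -PDList -aALL | check_raid_report` prints one WARNING line per suspicious field and returns a non-zero exit status if anything was flagged. A clean result does not replace a manual review, since labels that do not match are silently skipped.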