000037799 - How to identify a failed hard drive on an RSA SecurID A250 hardware appliance

Document created by RSA Customer Support Employee on Aug 2, 2019
Version 1

Article Content

Article Number
000037799

Applies To
RSA Product Set: SecurID
RSA Product/Service Type: Authentication Manager
RSA Version/Condition: 8.x
Platform: Hardware Appliance A250

Issue
If you suspect that a hard drive has failed in an RSA SecurID A250 hardware appliance, it can be difficult to confirm the failure and to identify which drive is affected.

Resolution
To check the status of your hard drives:
  1. Download the RAID CmdTool2 package for Linux from Intel.
  2. Unzip ir3_CmdTool2_Linux_v8.07.16.zip and copy CmdTool2-8.07.16-1.noarch.rpm to the /tmp/ directory on the appliance using WinSCP, SCP, etc.
  3. Launch an SSH client, such as PuTTY.
  4. Log in to the RSA SecurID Appliance as rsaadmin and enter the operating system password.

     Note: A different user name may have been selected during Quick Setup. If so, log in with that user name instead.



  5. Switch to root using the command:

sudo su -

  6. Set read, write, and execute permissions on the file (mode 775 grants full access to the owner and group, and read and execute access to others):

chmod 775 /tmp/CmdTool2-8.07.16-1.noarch.rpm

  7. Install the package using the following command:

rpm -ivh /tmp/CmdTool2-8.07.16-1.noarch.rpm

  8. Run the following command to show a summary of your RAID array status:

/opt/MegaRAID/CmdTool2/CmdTool264 -ShowSummary -aALL


A healthy output should look like the example below, where each physical drive (PD) listed under Hardware reports a State of Online and the virtual drive listed under Storage reports a State of Optimal:



System
        Operating System:  Linux version 4.4.156-94.64-default
        Driver Version: 07.701.17.00-rc1
        CLI Version: 8.07.16

Hardware
        Controller
                 ProductName       : Intel(R) Integrated RAID Module RMS25CB080(Bus 0, Dev 0)
                 SAS Address       : 5001e6788a5ef000
                 FW Package Version: 23.12.0-0013
                 Status            : Optimal
        BBU
                 BBU Type          :
                 Status            : Healthy
        Enclosure
                 Product Id        : SGPIO
                 Type              : SGPIO
                 Status            : OK

        PD
                Connector          : Port 0 - 3<Internal>: Slot 1
                Vendor Id          : HITACHI
                Product Id         : HUC10606 CLAR600
                State              : Online
                Disk Type          : SAS,Hard Disk Device
                Capacity           : 557.861 GB
                Power State        : Active

                Connector          : Port 0 - 3<Internal>: Slot 0
                Vendor Id          : HITACHI
                Product Id         : HUC10606 CLAR600
                State              : Online
                Disk Type          : SAS,Hard Disk Device
                Capacity           : 557.861 GB
                Power State        : Active

Storage

       Virtual Drives
                Virtual drive      : Target Id 0 ,VD name
                Size               : 557.861 GB
                State              : Optimal
                RAID Level         : 1


Exit Code: 0x00
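
On a long summary it is easy to overlook a single bad State line. The following is a minimal sketch, not part of CmdTool2, that reads the summary text on standard input and reports any drive whose State is neither Online nor Optimal. The function name check_raid_summary is hypothetical, and the sketch assumes the output layout matches the example above (a Connector or Virtual drive line followed by a State line).

```shell
#!/bin/sh
# check_raid_summary (hypothetical helper): reads "-ShowSummary" text on stdin.
# Prints one line per drive whose State is not Online/Optimal.
# Returns 0 if every State is healthy, 1 otherwise.
check_raid_summary() {
    awk '
        # Remember which physical or virtual drive the next fields belong to.
        /^[ \t]*(Connector|Virtual drive)[ \t]*:/ {
            label = $0
            sub(/^[ \t]+/, "", label)
            gsub(/[ \t]+/, " ", label)
        }
        # Match only the "State" field, not "Power State" or "Status".
        /^[ \t]*State[ \t]*:/ {
            state = $0
            sub(/^[^:]*:[ \t]*/, "", state)
            sub(/[ \t]+$/, "", state)
            if (state != "Online" && state != "Optimal") {
                printf "%s -> State: %s\n", label, state
                bad = 1
            }
        }
        END { exit (bad ? 1 : 0) }
    '
}
```

Assuming the function is defined in the current root shell, it can be fed directly from the command in the last step:

/opt/MegaRAID/CmdTool2/CmdTool264 -ShowSummary -aALL | check_raid_summary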
Notes
Possible states of the virtual drive are:
  • Optimal.  Both hard drives in the array are working properly and data is mirrored between them.
  • Critical.  One hard drive is not working as expected. The array is still operational but not fully redundant.
  • Rebuilding.  Data is being copied from one hard drive to the other. The array is operational but not fully redundant.
  • Offline.  Neither hard drive is working properly and the array is not operational.
Possible states of the Physical Drive (PD) are:

  • Online.  Hard drive is part of the array and working properly.
  • Unconfigured Good.  Hard drive is OK, but is not part of the array and has no data on it.
  • Unconfigured Good (Foreign).  Hard drive is OK, but it carries a configuration that the RAID controller does not recognize.
  • Rebuilding.  Hard drive is OK and is copying data from the other hard drive in the array to achieve full redundancy for the array.
  • Unconfigured Bad.  Hard drive is not OK. Hard drive has a logical failure. 
  • Failed.  Hard drive is not OK. It has a physical failure.
  • Missing.  Hard drive has been removed from the server or is facing a physical failure.
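
For scripted checks, the physical drive states listed above can be mapped to a severity. The sketch below is one possible mapping, not vendor guidance: the function name classify_pd_state, the severity labels, and the return codes are all assumptions made for illustration, and the state strings are taken from the list above, so the exact spelling printed by CmdTool2 may differ.

```shell
#!/bin/sh
# classify_pd_state STATE (hypothetical helper)
# Prints "<severity>: <meaning>" for a physical drive state and returns
# 0 (ok), 1 (warning), 2 (fault) or 3 (state not in the list above).
# Severity labels and return codes are illustrative, not CmdTool2 output.
classify_pd_state() {
    case "$1" in
        "Online")
            echo "ok: part of the array and working properly"; return 0 ;;
        "Rebuilding")
            echo "warning: copying data from the other drive"; return 1 ;;
        "Unconfigured Good")
            echo "warning: healthy but not part of the array"; return 1 ;;
        "Unconfigured Good (Foreign)")
            echo "warning: healthy but holds an unrecognized configuration"; return 1 ;;
        "Unconfigured Bad")
            echo "fault: logical failure"; return 2 ;;
        "Failed")
            echo "fault: physical failure"; return 2 ;;
        "Missing")
            echo "fault: removed from the server or physical failure"; return 2 ;;
        *)
            echo "unknown: state not covered here"; return 3 ;;
    esac
}
```

For example, classify_pd_state "Unconfigured Bad" prints the fault line for a logical failure and returns 2, which a calling script could use to decide whether to raise an alert.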

If you need more detail about the RAID status, collect RAID logs using the steps in article 000037640 - How to collect RAID logs using Intel RAID CmdTool2 for the RSA SecurID A250 Intel-based Hardware Appliances S2600GZ/GL.
