000027688 - How to collect log data and restore replication after a replication failure on an RSA SecurID Appliance 3.0

Document created by RSA Customer Support Employee on Jun 14, 2016
Last modified by RSA Customer Support Employee on Apr 21, 2017
Version 4

Article Content

Article Number: 000027688
Applies To: RSA Product Set: SecurID
RSA Product/Service Type: Authentication Manager, SecurID Appliance
RSA Version/Condition: 7.1 Service Pack 2 (SP2), Service Pack 3 (SP3), and Service Pack 4 (SP4)
Issue: Collect log data that can be sent to RSA Customer Service for analysis, and restore replication after a replication failure on an RSA SecurID Appliance 3.0 running RSA Authentication Manager 7.1.2 or later
Replication failed
Replication error or Replication logs
 
Resolution: How to collect log data and restore replication after a replication failure on an RSA SecurID Appliance 3.0
Note: This procedure instructs you to run RSA utilities. These utilities require the master password for your deployment. You created the master password when you ran Quick Setup on the primary Appliance.

Collecting log data
To collect log data that you can send to Customer Service for analysis:
  1. Do the following on the primary Appliance:

    a. Connect to the Appliance using the console or an SSH client. (For remote access using an SSH client, the Appliance must be configured to allow SSH connectivity. For more information, see the RSA Operations Console help.)

    b. Log on to the Appliance using the emcsrv account and the Operating System password. (You created the Operating System password when you ran Quick Setup on the Appliance.)

    c. Switch users to rsaadmin. Run:
      sudo su           (makes you the root superuser)
      su  rsaadmin      (makes you the rsaadmin fileowner)

    When prompted, enter the emcsrv password.

    d. Set the current directory to /usr/local/RSASecurity/RSAAuthenticationManager/utils. Run:
      cd /usr/local/RSASecurity/RSAAuthenticationManager/utils

    e. Set the environment variables using the RSA script. Run:
      . ./rsaenv
    Note: The first 4 characters in this command are period, space, period, and forward slash.

    f. Generate the logs. Run the following commands. Enter each command on one line.
     
    ./rsautil manage-database -a exec-sql -f diagnostics/IMS_RepLogRpt.sql -A log_primary.html -U com.rsa.replication.admin

    ./rsautil manage-database -a exec-sql -f diagnostics/IMS_RepInfoRpt.sql -A info_primary.html -U com.rsa.replication.admin

    ./rsautil manage-database -a exec-sql -f diagnostics/IMS_RepErrorRpt.sql -A error_primary.html -U com.rsa.replication.admin

    ./rsautil manage-database -a exec-sql -f diagnostics/streams_hc_10GR2.sql -U com.rsa.replication.admin

    free -m > /tmp/freempri.txt



    df -k > /tmp/dfkpri.txt

    uptime > /tmp/uptimepri.txt

    omreport chassis info > /tmp/omreportpri.txt


    g. Collect the files from the utils directory (log_primary.html, info_primary.html, error_primary.html, and checkresult.html) and copy them to the /tmp directory.
    Note: Modify the file names so that the file name identifies the system where you collected the log data. For example, you can add the local host name to the file name so that it looks similar to the following: checkresult_USIT-RSAApp2.html
     
  2. Do the following on the replica Appliance where the replication failure occurred:

    a. Connect to the Appliance using the console or an SSH client. (For remote access using an SSH client, the Appliance must be configured to allow SSH connectivity. For more information, see the RSA Operations Console help.)

    b. Log on to the Appliance using the emcsrv account and the Operating System password. (You created the Operating System password when you ran Quick Setup on the Appliance.)

    c. Switch users to rsaadmin. Run:
      sudo su - rsaadmin
    When prompted, enter the emcsrv password.

    d. Set the current directory to /usr/local/RSASecurity/RSAAuthenticationManager/utils. Run:
      cd /usr/local/RSASecurity/RSAAuthenticationManager/utils

    e. Set the environment variables using the RSA script. Run:
      . ./rsaenv
    Note: The first 4 characters in this command are period, space, period, and forward slash.

    f. Generate the logs. Run the following commands. Enter each command on one line.
     
    ./rsautil manage-database -a exec-sql -f diagnostics/IMS_RepLogRpt.sql -A log_replica.html -U com.rsa.replication.admin

    ./rsautil manage-database -a exec-sql -f diagnostics/IMS_RepInfoRpt.sql -A info_replica.html -U com.rsa.replication.admin

    ./rsautil manage-database -a exec-sql -f diagnostics/IMS_RepErrorRpt.sql -A error_replica.html -U com.rsa.replication.admin

    ./rsautil manage-database -a exec-sql -f diagnostics/streams_hc_10GR2.sql -U com.rsa.replication.admin

    free -m > /tmp/freemrep.txt



    df -k > /tmp/dfkrep.txt

    uptime > /tmp/uptimerep.txt

    omreport chassis info > /tmp/omreportrep.txt


    g. Collect the files from the utils directory (log_replica.html, info_replica.html, error_replica.html, and checkresult.html) and copy them to the /tmp directory.
    Note: Modify the file names so that the file name identifies the system where you collected the log data. For example, you can add the local host name to the file name so that it looks similar to the following: checkresult_USIT-RSAApp4.html
     
  3. On the primary and each replica, do the following:

    a. Run a replication status report. Run:
      ./rsautil manage-replication -a report

    b. Collect the alert_<instance name>.log file, where <instance name> is an 8-character, random-looking name. This file is in:
    /usr/local/RSASecurity/RSAAuthenticationManager/db/admin/<instance name>/bdump 

    c. If the RSA engineer requests them, also collect the *.trc files written from the point at which replication failed onward. They are in the same directory as the alert_<instance name>.log file.
     
  4. Copy all collected files to the /tmp directory. Then run exit (which returns you to root), and run chmod on the /tmp directory to set the permissions to 777.
  5. Use a Secure Copy client (such as WinSCP) to copy the files to another system. When you connect to the Appliance, log on using the emcsrv account and the Operating System password.
  6. Upload the logs to RSA Customer Support for analysis.
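
The copy-and-rename work in steps 1g, 2g, and 4 can be scripted. The following is a minimal sketch, not an RSA-supplied utility: the function name stage_reports and its arguments are hypothetical, and it assumes you run it as rsaadmin from the utils directory after the reports have been generated.

```shell
# Hypothetical helper (not part of the RSA tooling).
# Copies each named file from SRC_DIR to DEST_DIR and appends a host
# label to the file name, for example:
#   checkresult.html -> checkresult_USIT-RSAApp2.html
# Usage: stage_reports SRC_DIR DEST_DIR HOST_LABEL FILE...
stage_reports() {
    src_dir=$1
    dest_dir=$2
    host_label=$3
    shift 3
    rc=0
    for name in "$@"; do
        if [ ! -f "$src_dir/$name" ]; then
            # Report the gap but keep going so the other files are still staged.
            echo "missing: $src_dir/$name" >&2
            rc=1
            continue
        fi
        case $name in
            *.*) target="${name%.*}_${host_label}.${name##*.}" ;;
            *)   target="${name}_${host_label}" ;;
        esac
        cp "$src_dir/$name" "$dest_dir/$target" || rc=1
    done
    return "$rc"
}
```

On the primary, for example, you might call it as: stage_reports . /tmp "$(hostname)" log_primary.html info_primary.html error_primary.html checkresult.html. A nonzero return value means at least one file was missing or could not be copied.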
Restarting replication
Restarting replication stops and starts the replication processes. This restores replication in many cases, but does not address the cause of the replication failure.
Avoid pausing and resuming replication when a replication issue exists. Pausing and resuming requires communication between the primary and the replicas, and if the replication processes are abnormal, it might take a significant amount of time and often fails.
To restart replication (if it is imperative that the replica be restored immediately), run the following commands on the primary and then on each replica:
 
./rsautil manage-database -a exec-sql -U com.rsa.replication.admin -f diagnostics/disable-rep.sql
./rsautil manage-database -a exec-sql -U com.rsa.replication.admin -f diagnostics/enable-rep.sql
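
If you script the two commands above, it can help to stop when the first one fails so that enable-rep.sql is not run against a half-disabled configuration. The following is a minimal sketch under stated assumptions: the function name restart_replication and the RSAUTIL override variable are hypothetical, and it assumes that rsautil returns a nonzero exit status on failure, which this article does not confirm. Run it from the utils directory after sourcing rsaenv; rsautil still prompts for the master password.

```shell
# Hypothetical wrapper (not part of the RSA tooling).
# Runs disable-rep.sql, then enable-rep.sql, on the local Appliance only.
# Set RSAUTIL=echo to print the commands instead of running them.
restart_replication() {
    rsautil_cmd=${RSAUTIL:-./rsautil}
    for script in diagnostics/disable-rep.sql diagnostics/enable-rep.sql; do
        # Assumption: a nonzero exit status from rsautil means the script failed.
        "$rsautil_cmd" manage-database -a exec-sql \
            -U com.rsa.replication.admin -f "$script" || {
            echo "stopped: $script did not complete" >&2
            return 1
        }
    done
}
```

As with the manual commands, run it on the primary first and then on each replica.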


Reattaching the affected replica
You can reattach the affected replica using the RSA Operations Console. Before you proceed, read the information in the following note.
 
Notes: Important information about reattaching replicas
RSA recommends that you identify the cause of a replication failure before you reattach a replica. When you reattach a replica without knowing the cause of replication issues, consider the following aspects:
  • If the primary has an issue that is not recoverable, deleting a replica removes it as a promotion target.
  • Before you can attach a replica, the primary must have replication functioning normally, and you must do a replication cleanup.
  • Reattaching a replica often defers a replication failure; it does not resolve it.
Important: If the cause of a replication issue is an SP3 or SP4 upgrade failure, do not reattach the replica. The upgrade failure needs to be addressed first.
If you decide to reattach a replica without identifying the cause, perform these steps first:
  1. Create a backup of the primary using the RSA Operations Console. If the backup fails, do not delete the replica. Backup and attach both use Oracle Data Pump, and if there is a problem with Data Pump, you cannot attach the replica.
  2. Copy the backup and the secrets file from the primary.
  3. Keep your license (a *.ZIP file) and all Service Pack and Patch files necessary to rebuild your environment to the same version from which you took your backup.
    Note: If you need to rebuild your environment, someone must have physical access to each Appliance to restore the factory default settings and run Quick Setup.
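
Before you delete or reattach anything, you can confirm that the files from steps 1 through 3 are actually on the system where you stored them. The following is a minimal sketch: the function name check_rebuild_files is hypothetical, and you supply the real paths to your backup, secrets file, license, and Service Pack or Patch files, because this article does not name them.

```shell
# Hypothetical helper (not part of the RSA tooling).
# Confirms that every path given as an argument is an existing, non-empty file.
# Returns 0 only if all of them are present.
check_rebuild_files() {
    missing=0
    for path in "$@"; do
        if [ -s "$path" ]; then
            echo "ok: $path"
        else
            echo "MISSING OR EMPTY: $path" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

A nonzero return value means at least one file is absent or empty, in which case it is safer not to proceed with the reattach.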
For information on this issue as it relates to RSA Authentication Manager on Windows Server 2003, see "How to collect log data and restore replication after a replication failure in RSA Authentication Manager 7.1.2 or later on Windows."
For a similar solution on Linux or Solaris, see "How to collect log data and restore replication after a replication failure on AM7.1 on Linux or Solaris."
 
Legacy Article ID: a51084
