000027699 - How to collect log data and restore replication after a replication failure in RSA Authentication Manager 7.1.2 or later on Windows

Document created by RSA Customer Support Employee on Jun 14, 2016. Last modified by RSA Customer Support Employee on Apr 21, 2017.

Article Content

Article Number: 000027699

Applies To:
  • Windows Server 2003
  • RSA Authentication Manager 7.1 Service Pack 2 (SP2), Service Pack 3 (SP3), and Service Pack 4 (SP4)
  • Windows Server 2008

Issue:
  • Collect log data that can be sent to RSA Customer Service for analysis
  • Restore replication after a replication failure in RSA Authentication Manager 7.1.2 or later
  • Replication failed or Replication logs
  • Replication error

Resolution

How to collect log data and restore replication after a replication failure in RSA Authentication Manager 7.1.2 or later


 


When a replication failure occurs in RSA Authentication Manager, the replica can be deleted and reattached (assuming SP2 or higher has been installed). However, simply reattaching a replica does not address the root cause of a replication failure and does not prevent the same failure from recurring. Also, if the primary is not recoverable, a complete reinstallation of the environment may need to be done. For these reasons, it is important to gather data prior to taking action to restore replication.


 


After the data is collected, you should disable and then enable replication. (This is different from pausing and resuming.) Depending on the cause of the replication failure, this measure may not restore replication. Also, successfully restoring replication does not mean that the underlying replication issue is resolved.


 


RSA recommends that you identify the cause of a replication failure before you reattach a replica. However, if you decide to reattach a replica without identifying the cause, follow the important instructions at the end of this solution.


 


Note: This procedure instructs you to run RSA utilities. These utilities require the master password for your deployment. You created the master password when you installed the primary.


 


Collecting log data


To collect log data that you can send to Customer Service for analysis:


  1. On the primary, open a command prompt and run the following commands.

    a. Run: 
        cd c:\Program Files\RSA Security\RSA Authentication Manager\utils

    b. Run: 
        rsautil manage-database -a exec-sql -f diagnostics/IMS_RepLogRpt.sql -A log_primary.html -U com.rsa.replication.admin

    c. Run: 
        rsautil manage-database -a exec-sql -f diagnostics/IMS_RepInfoRpt.sql -A info_primary.html -U com.rsa.replication.admin

    d. Run: 
        rsautil manage-database -a exec-sql -f diagnostics/IMS_RepErrorRpt.sql -A error_primary.html -U com.rsa.replication.admin

    e. Run: 
        rsautil manage-database -a exec-sql -f diagnostics/streams_hc_10GR2.sql -U com.rsa.replication.admin


    Some of these commands require the master password for your deployment. Enter the password when prompted. 
     
  2. Collect the files from the utils directory (log_primary.html, info_primary.html, error_primary.html and checkresult.html).
    Note: Modify the file names so that the file name identifies the system where you collected the log data. For example, you can add the local host name to the file name so that it looks similar to the following: checkresult_USIT-RSAAuthMgr4.html
  3. On the replica where the replication failure occurred, run the following commands.

    a. Run: 
        cd c:\Program Files\RSA Security\RSA Authentication Manager\utils

    b. Run: 
        rsautil manage-database -a exec-sql -f diagnostics/IMS_RepLogRpt.sql -A log_replica.html -U com.rsa.replication.admin

    c. Run: 
        rsautil manage-database -a exec-sql -f diagnostics/IMS_RepInfoRpt.sql -A info_replica.html -U com.rsa.replication.admin

    d. Run: 
        rsautil manage-database -a exec-sql -f diagnostics/IMS_RepErrorRpt.sql -A error_replica.html -U com.rsa.replication.admin

    e. Run: 
        rsautil manage-database -a exec-sql -f diagnostics/streams_hc_10GR2.sql -U com.rsa.replication.admin
     
  4. Collect the files from the utils directory (log_replica.html, info_replica.html, error_replica.html and checkresult.html).
    Note: Modify the file names so that the file name identifies the system where you collected the log data. For example, you can add the local host name to the file name so that it looks similar to the following: checkresult_USIT-RSAAuthMgr2.html
  5. On the primary and each replica, do the following:

    a. Run a replication status report. Run: rsautil manage-replication -a report 

    b. Collect the alert log and the *.trc files written from the time replication failed onward. These files are in: C:\Program Files\RSA Security\RSA Authentication Manager\db\admin\<instance name>\bdump
     
  6. Upload the logs to RSA Customer Support for analysis.
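
The file handling in steps 2, 4, and 5b can be sketched in Python as shown below. This is a minimal sketch, not an RSA-supplied tool: it assumes Python is available on the system (or on a workstation to which the files were copied), the function names tag_reports and collect_traces are hypothetical, and it assumes the alert log file name starts with "alert" and ends in ".log". It copies files rather than renaming them, so the originals in the utils and bdump directories are left untouched.

```python
import shutil
import socket
from pathlib import Path

# Report file names taken from steps 2 and 4 of the procedure above.
REPORTS = {
    "primary": ["log_primary.html", "info_primary.html",
                "error_primary.html", "checkresult.html"],
    "replica": ["log_replica.html", "info_replica.html",
                "error_replica.html", "checkresult.html"],
}


def tag_reports(utils_dir, dest_dir, role, host=None):
    """Copy the reports for `role` to dest_dir, adding the host name to each file name."""
    host = host or socket.gethostname()
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for name in REPORTS[role]:
        src = Path(utils_dir) / name
        if not src.is_file():
            continue  # report was not generated; skip it rather than fail
        # Example result: checkresult.html -> checkresult_<host>.html
        target = dest / f"{src.stem}_{host}{src.suffix}"
        shutil.copy2(src, target)
        copied.append(target)
    return copied


def collect_traces(bdump_dir, dest_dir, failed_at):
    """Copy alert logs and *.trc files last modified at or after `failed_at`.

    `failed_at` is the approximate failure time in seconds since the epoch.
    Assumption (not from the article): the alert log is named alert*.log.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for src in sorted(Path(bdump_dir).iterdir()):
        suffix = src.suffix.lower()
        is_trace = suffix == ".trc"
        is_alert = src.name.lower().startswith("alert") and suffix == ".log"
        if src.is_file() and (is_trace or is_alert) and src.stat().st_mtime >= failed_at:
            shutil.copy2(src, dest / src.name)
            copied.append(dest / src.name)
    return copied


# Hypothetical usage on a primary (replace <instance name> with your own):
#   utils = r"C:\Program Files\RSA Security\RSA Authentication Manager\utils"
#   tag_reports(utils, r"C:\collected", "primary")
#   collect_traces(r"C:\...\db\admin\<instance name>\bdump", r"C:\collected",
#                  failed_at=datetime(2017, 4, 20, 9, 0).timestamp())
```

Because the alert log is a single file that grows over time, the sketch copies it whole whenever it was modified after the failure time. Trimming it to the relevant period, if desired, is left to the person collecting the data.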

Restarting replication


 


Restarting replication stops and starts the replication processes. This restores replication in many cases, but does not address the cause of the replication failure.


 


Avoid pausing and resuming replication when a replication issue exists. Pausing and resuming requires communication between the primary and the replicas. If the replication processes are in an abnormal state, the operation might take a significant amount of time, and it often fails.


 


To restart replication (if it is imperative that the replica be restored immediately), run the following commands on the primary and then on each replica.


  1. Run: 
    rsautil manage-database -a exec-sql -U com.rsa.replication.admin -f diagnostics/disable-rep.sql
  2. Run: 
    rsautil manage-database -a exec-sql -U com.rsa.replication.admin -f diagnostics/enable-rep.sql
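
When several replicas are involved, it can help to write out the sequence before starting. The Python sketch below only builds a checklist of the two commands above. It does not run rsautil. It assumes one reading of the instruction: both commands are run on the primary first, and then both on each replica in turn. The function name restart_plan and the host names are hypothetical.

```python
# Command line taken from the two steps above; only the script name varies.
RSAUTIL = ("rsautil manage-database -a exec-sql "
           "-U com.rsa.replication.admin -f diagnostics/{script}")


def restart_plan(primary, replicas):
    """Return (host, command) pairs: primary first, then each replica.

    Assumption: on each host, disable-rep.sql is run before enable-rep.sql,
    and both are completed before moving on to the next host.
    """
    plan = []
    for host in [primary, *replicas]:
        for script in ("disable-rep.sql", "enable-rep.sql"):
            plan.append((host, RSAUTIL.format(script=script)))
    return plan


if __name__ == "__main__":
    # Hypothetical host names, for illustration only.
    for host, command in restart_plan("primary01", ["replica01", "replica02"]):
        print(f"[{host}] {command}")
```

Each command still has to be run by hand from the utils directory on the host shown, because rsautil may prompt for the master password.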

Reattaching the affected replica


You can reattach the affected replica using the RSA Operations Console. Before you proceed, read the information in the following note.

Notes

Important information about reattaching replicas

RSA recommends that you identify the cause of a replication failure before you reattach a replica. If you reattach a replica without knowing the cause of the replication issue, consider the following:
  • If the primary has an issue that is not recoverable, deleting a replica removes it as a promotion target.
  • Before you can attach a replica, the primary must have replication functioning normally, and you must do a replication cleanup.
  • Reattaching a replica often defers a replication failure; it does not resolve it.
Important: If the cause of a replication issue is an SP3 or SP4 upgrade failure, do not reattach the replica. The upgrade failure needs to be addressed first.

 


If you decide to reattach a replica without identifying the cause, perform these steps first:


  1. Create a backup of the primary using the RSA Operations Console. If the backup fails, do not delete the replica. Backup and attach both use Oracle Data Pump, and if there is a problem with Data Pump, you cannot attach the replica.
  2. Copy the backup and the secrets file from the primary.
  3. Keep your license (a *.ZIP file) and all Service Pack and patch files needed to rebuild your environment to the same version as the one from which you took your backup.
For information on this issue as it relates to the RSA SecurID Appliance, see solution a51084.
For information on cleaning up after failed replication, see solution a51068.
Note: Authentication Manager 7.1 before SP4 is no longer covered under Primary Support.
For a similar solution on Linux or Solaris, see A60894.
 
Legacy Article ID: a51085
