000035922 - RSA Authentication Manager 8.x primary replication service is shutdown and fails to start manually

Document created by RSA Customer Support Employee on May 31, 2018
Version 1

Article Content

Article Number: 000035922
Applies To: RSA Product Set: SecurID
RSA Product/Service Type: Authentication Manager
RSA Version/Condition: 8.1 or later
  • There are replication failures across the deployment.

  • When checking the service status, all the services are up and running, except RSA Replication (Primary).
  • Restarting the primary replication service manually still fails.

cd /opt/rsa/am/server/
./rsaserv restart primary_replication
Starting RSA Replication (Primary) ********************
RSA Replication (Primary)                                   [FAILED]

  • The /opt/rsa/am/server/logs/PrimaryReplication.log file reports the following error:

@@@2018-01-15 23:14:48,897 FATAL [WrapperSimpleAppMain ]
Service.start(98) | <primary_hostname>,,,,Unhandled exception during service start.
java.lang.RuntimeException: Could not acquire lock. Another instance of this service may be running. Aborting...
at com.rsa.replication.util.SingleProcessLock.<init>(SingleProcessLock.java:37)
at com.rsa.replication.util.Service.start(Service.java:91)
at com.rsa.replication.PrimaryReplicationService.main(PrimaryReplicationService.java:136)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.tanukisoftware.wrapper.WrapperSimpleApp.run(WrapperSimpleApp.java:248)
at java.lang.Thread.run(Thread.java:680)
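To confirm that a deployment is hitting this specific error, the log can be searched for the lock message. The following is a minimal sketch, not an RSA-supplied tool: `show_lock_errors` is a hypothetical helper name, and the default path is the log file named above.

```shell
#!/bin/sh
# Sketch: print every line of the replication log that contains the
# lock-acquisition error, prefixed with its line number.
# show_lock_errors is a hypothetical helper, not part of the product.
show_lock_errors() {
    log_file="${1:-/opt/rsa/am/server/logs/PrimaryReplication.log}"
    # -n: show line numbers, -F: treat the message as a fixed string.
    # grep exits non-zero when the message is absent.
    grep -nF 'Could not acquire lock' "$log_file"
}

# Example usage:
#   show_lock_errors || echo "lock error not found in the log"
```

A non-zero exit status means the message was not found, in which case the failure may have a different cause than the one described in this article.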
Cause: Running out of disk space is a common cause of this service failure, but other causes are possible, such as memory thread or certificate issues.
Please see knowledge article 000036019 - RSA Authentication Manager 8.x: Large Disk Space Used by Logs to check whether archive logs are located in the correct directory.
  1. First, check whether disk space is running low and, if so, free up space. If there is sufficient disk space, skip to step 4.

rsaadmin@am82p:~> df -h
Filesystem      Size  Used Avail Use% Mounted on
rootfs           99G   92G   8G  92% /
udev            4.0G  128K  4.0G   1% /dev
tmpfs           4.0G   12K  4.0G   1% /dev/shm
/dev/sda1        99G   92G   8G  92% /
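The disk check above can also be scripted. This is a sketch under stated assumptions: the helper names `use_percent_from_df` and `disk_is_tight` are hypothetical, and the default threshold of 90 percent is an arbitrary illustration, not an RSA recommendation.

```shell
#!/bin/sh
# Sketch: extract the Use% value from df output and compare it to a threshold.
# Both function names and the default threshold of 90 are assumptions.

# Reads df output on stdin and prints the Use% digits of the first data row.
use_percent_from_df() {
    awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Succeeds when the filesystem holding the given path is at or above the threshold.
disk_is_tight() {
    # df -P keeps each filesystem on a single line, so column 5 is the usage.
    pct=$(df -P "${1:-/}" | use_percent_from_df)
    [ "$pct" -ge "${2:-90}" ]
}

# Example usage:
#   disk_is_tight / 90 && echo "free up disk space before restarting the service"
```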

  2. Next, list the large files in the server directory:

rsaadmin@am82p:~> ls -lah /opt/rsa/am/server | grep '[0-9][GM]'
total 910M
-rw-------  1 rsaadmin rsaadmin  151M Mar  1 01:00 system_2017-11-19_0.log
-rw-------  1 rsaadmin rsaadmin  159M Mar  2 01:00 system_2017-11-20_0.log
-rw-------  1 rsaadmin rsaadmin  156M Mar  3 01:00 system_2017-11-21_0.log
-rw-------  1 rsaadmin rsaadmin  175M Mar  4 01:00 system_2017-11-22_0.log
-rw-------  1 rsaadmin rsaadmin  269M Mar  5 01:00 system_2017-11-23_0.log
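As an alternative to filtering `ls` output with `grep`, `find` can select files by size directly. The sketch below assumes GNU `find`; `list_large_files` is a hypothetical helper name and the 100M default threshold is an assumption chosen to match the sizes shown above.

```shell
#!/bin/sh
# Sketch: list regular files in one directory that exceed a size threshold.
# list_large_files and the 100M default are illustrative assumptions.
list_large_files() {
    dir="${1:-/opt/rsa/am/server}"
    min_size="${2:-100M}"
    # -maxdepth 1 stays in the directory itself; -size +N matches files larger than N.
    find "$dir" -maxdepth 1 -type f -size +"$min_size" -exec ls -lh {} +
}

# Example usage:
#   list_large_files /opt/rsa/am/server 100M
```

Unlike the `grep` filter, this does not match the `total` line or files whose names merely contain a digit followed by G or M.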

  3. Delete the files that are taking up the most space. For example, to delete the system log files dated November 20, 2017 through November 23, 2017:

rm /opt/rsa/am/server/system_2017-11-2[0-3]_0.log
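Because `rm` with a glob pattern is irreversible, it may be worth listing what the pattern matches before deleting. The sketch below reuses the same pattern with `ls`, which only lists files; `preview_log_matches` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: show which files the rm pattern above would match, without deleting anything.
# preview_log_matches is a hypothetical helper name.
preview_log_matches() {
    dir="${1:-/opt/rsa/am/server}"
    # [0-3] matches a single digit, so this covers 2017-11-20 through 2017-11-23.
    ls -1 "$dir"/system_2017-11-2[0-3]_0.log
}

# Example usage:
#   preview_log_matches /opt/rsa/am/server
```

The file dated 2017-11-19 is not matched, since the pattern fixes the first digit of the day to 2.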

  4. Once disk space is freed, delete the PrimaryReplicationService.lock file in /opt/rsa/am/server/wrapper.
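The lock file removal might look like the following from the command line. This is a sketch built from the file name and directory given in this step; `remove_replication_lock` is a hypothetical helper name, and the sketch does not check whether another instance of the service is actually running.

```shell
#!/bin/sh
# Sketch: remove the primary replication lock file if it exists.
# remove_replication_lock is a hypothetical helper; the default path combines
# the file name and directory named in this step.
remove_replication_lock() {
    lock_file="${1:-/opt/rsa/am/server/wrapper/PrimaryReplicationService.lock}"
    if [ -e "$lock_file" ]; then
        rm -f "$lock_file" && echo "removed $lock_file"
    else
        echo "no lock file at $lock_file"
    fi
}

# Example usage:
#   remove_replication_lock
```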

  5. Restart the Primary Replication service manually:

cd /opt/rsa/am/server/
./rsaserv restart primary_replication

The service should now start correctly.
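To check the result of the restart in a script, the command output can be scanned for the `[FAILED]` marker shown in the failing restart earlier in this article. This is a sketch that assumes the marker is written to standard output; `restart_reported_failure` is a hypothetical helper name, and the text printed on success is not assumed.

```shell
#!/bin/sh
# Sketch: detect the "[FAILED]" marker in rsaserv output read from stdin.
# restart_reported_failure is a hypothetical helper name.
restart_reported_failure() {
    # -q: no output, exit status only; -F: match the marker as a fixed string.
    grep -qF '[FAILED]'
}

# Example usage on the appliance:
#   cd /opt/rsa/am/server/ && ./rsaserv restart primary_replication \
#       | restart_reported_failure && echo "primary replication still failing"
```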

  6. After the service starts successfully, open the Operations Console. From the Home tab, click the Replication Status Report link, then click the Sync link if replication is shown to be out of sync.
  7. Ensure the status is updated and shows a Normal state.