Article Number
000035922
Applies To
RSA Product Set: SecurID
RSA Product/Service Type: Authentication Manager
RSA Version/Condition: 8.1 or later
Issue
- There are replication failures across the deployment.
- When checking the service status, all the services are up and running, except RSA Replication (Primary).
- Restarting the primary replication service manually still fails.
cd /opt/rsa/am/server/
./rsaserv restart primary_replication
Starting RSA Replication (Primary) ********************
RSA Replication (Primary) [FAILED]
- The /opt/rsa/am/server/logs/PrimaryReplication.log reports the error:
@@@2018-01-15 23:14:48,897 FATAL [WrapperSimpleAppMain ]
Service.start(98) | <primary_hostname>,,,,Unhandled exception during service start.
java.lang.RuntimeException: Could not acquire lock. Another instance of this service may be running. Aborting...
at com.rsa.replication.util.SingleProcessLock.<init>(SingleProcessLock.java:37)
at com.rsa.replication.util.Service.start(Service.java:91)
at com.rsa.replication.PrimaryReplicationService.main(PrimaryReplicationService.java:136)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.tanukisoftware.wrapper.WrapperSimpleApp.run(WrapperSimpleApp.java:248)
at java.lang.Thread.run(Thread.java:680)
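To confirm that this lock error is what stops the service, the log can be searched for the message shown in the trace. The following is a minimal sketch, assuming the log path given in this article as the default. The function name `find_lock_errors` is hypothetical and not part of the product.

```shell
# Hypothetical helper (not an RSA tool): print every log line, with its
# line number, that contains the lock error from the stack trace.
# The default path is the log location named in this article.
find_lock_errors() {
    log_file="${1:-/opt/rsa/am/server/logs/PrimaryReplication.log}"
    grep -n "Could not acquire lock" "$log_file"
}
```

Calling `find_lock_errors` with no argument reads the default log. Passing a path as the first argument checks a copy of the log instead.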
Cause
Running out of disk space is one known cause of this service failure. Other possible causes include memory thread issues or certificate issues.
See knowledge article 000036019 - RSA Authentication Manager 8.x: Large Disk Space Used by Logs to check whether archive logs are located in the correct directory.
Resolution
- First, check disk usage and free up space if the disk is nearly full. If there is sufficient disk space, skip ahead to the step that deletes the PrimaryReplicationService.lock file.
rsaadmin@am82p:~> df -h
Filesystem Size Used Avail Use% Mounted on
rootfs 99G 92G 8G 92% /
udev 4.0G 128K 4.0G 1% /dev
tmpfs 4.0G 12K 4.0G 1% /dev/shm
/dev/sda1 99G 92G 8G 92% /
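The usage figure can also be read programmatically instead of by eye. The sketch below assumes the POSIX `df -P` output format, where the last two columns are the capacity percentage and the mount point. The helper name `use_percent` is hypothetical.

```shell
# Hypothetical helper: read `df -P` output on stdin and print the Use% value
# (digits only) for the first filesystem mounted on the given mount point.
# Assumes mount points contain no spaces.
use_percent() {
    awk -v mp="$1" 'NR > 1 && $NF == mp { gsub(/%/, "", $(NF-1)); print $(NF-1); exit }'
}
```

For example, `df -P / | use_percent /` prints the usage of the root filesystem as a bare number, which can then be compared against whatever threshold is considered too full for the deployment.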
- Next, list the large log files in the server directory:
rsaadmin@am82p:~> ls -lah /opt/rsa/am/server | grep '[0-9][GM]'
total 910M
-rw------- 1 rsaadmin rsaadmin 151M Mar 1 01:00 system_2017-11-19_0.log
-rw------- 1 rsaadmin rsaadmin 159M Mar 2 01:00 system_2017-11-20_0.log
-rw------- 1 rsaadmin rsaadmin 156M Mar 3 01:00 system_2017-11-21_0.log
-rw------- 1 rsaadmin rsaadmin 175M Mar 4 01:00 system_2017-11-22_0.log
-rw------- 1 rsaadmin rsaadmin 269M Mar 5 01:00 system_2017-11-23_0.log
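As an alternative to filtering `ls` output with `grep`, large files can be selected by size directly. This is a sketch that swaps in `find` with the GNU `-size` and `-maxdepth` options. The function name `list_large_files` and the 100 MB default threshold are assumptions for illustration.

```shell
# Hypothetical helper: list regular files directly inside a directory that
# exceed a size threshold. The threshold uses find's -size syntax (+100M
# means larger than 100 MiB). Defaults are assumptions for illustration.
list_large_files() {
    dir="${1:-/opt/rsa/am/server}"
    min_size="${2:-+100M}"
    find "$dir" -maxdepth 1 -type f -size "$min_size" -exec ls -lh {} +
}
```

Running `list_large_files` with no arguments inspects the server directory used in this article. A different directory and threshold can be passed as the first and second arguments.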
- Delete the large log files. For example, to delete the system log files dated 20 November 2017 through 23 November 2017:
rm /opt/rsa/am/server/system_2017-11-2[0-3]_0.log
- Once disk space is freed, delete the PrimaryReplicationService.lock file in /opt/rsa/am/server/wrapper.
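The lock file removal can be sketched as follows, assuming the full path is formed from the directory and file name given in this step. The function name `remove_lock_file` is hypothetical. Because the error message warns that another instance may be running, it is worth confirming that the replication service is stopped before removing the file.

```shell
# Hypothetical helper: remove the lock file if present and report the result.
# The default path combines the directory and file name from this article.
remove_lock_file() {
    lock_file="${1:-/opt/rsa/am/server/wrapper/PrimaryReplicationService.lock}"
    if [ -e "$lock_file" ]; then
        ls -l "$lock_file"                      # show the file before removing it
        rm -f "$lock_file" && echo "removed $lock_file"
    else
        echo "no lock file at $lock_file"
    fi
}
```

Calling `remove_lock_file` with no argument acts on the default path. If the file is already absent, the function only reports that and changes nothing.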
- Restart the Primary Replication service manually:
cd /opt/rsa/am/server/
./rsaserv restart primary_replication
The service should now start correctly.
- On a successful service start, open the Operations Console. From the Home tab, click the Replication Status Report link, then click the Sync link if replication is shown to be out of sync.
- Ensure the status is updated and shows as being in a Normal state.