000030696 - AM 7.1 SP4 Backup fails, Database corruption (Oracle DB fixes)

Document created by RSA Customer Support Employee on Jun 14, 2016. Last modified by RSA Customer Support Employee on Apr 21, 2017.
Version 2

Article Content

Article Number: 000030696
Applies To:
RSA Product Set: SecurID
RSA Product/Service Type: SecurID Appliance
RSA Version/Condition: 3.0.4, AM 7.1 SP4
Platform: Windows, RPath Linux
Product Name: RSA-0010015
Product Description: RSA SecurID Appliance SW License
Issue: AM 7.1 SP4 or Appliance 3.0.4 backup fails with ORA-39125, or the database is corrupted. This article collects the various Oracle DB fixes, including last-chance ones. Any of the following Oracle error messages may be involved:
ORA-39125
ORA-1172:
ORA-01172:
ORA-01109:
ORA-19815:
ORA-19809:
ORA-1507:
ORA-01507:
ORA-16038:
ORA-00312:
ORA-01151:
ORA-00283: recovery session canceled due to errors
ORA-00264: no recovery required
ORA-00313: open failed for members of log group 1 of thread 1
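The error codes listed above can be searched for in a log file with a short shell function. This is only a minimal sketch: the function name find_ora_errors is hypothetical, and the log path is left as an argument because this article does not name a specific log file.

```shell
#!/bin/bash
# Hypothetical helper: print each ORA- code from this article that
# appears in the given log file, one code per line, without duplicates.
find_ora_errors() {
    local logfile="$1"
    # -o prints only the matched code, -w requires a whole-word match,
    # -E enables alternation. The optional zeros accept both the short
    # (ORA-1172) and the zero-padded (ORA-01172) spellings.
    grep -owE 'ORA-(39125|0?1172|0?1109|19815|19809|0?1507|16038|0{0,2}312|0?1151|0{0,2}283|0{0,2}264|0{0,2}313)' "$logfile" \
        | sort -u
}
```

Usage: find_ora_errors /path/to/logfile, where the path is whichever log you are reviewing.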
Cause: Corruption in the Oracle database, caused by a change record that is out of order, by insufficient archive space, or by a shutdown that leaves stuck Oracle processes (PIDs). This article summarizes all known Oracle fixes for AM 7.1 SP4.
Resolution: Some of these commands stop and start the database and/or all services, which makes the primary unavailable. If you have a replica, authentications continue to work, but any RSA administrators are logged out of the Security Console while this happens.
Steps for Appliance or RHEL Linux AM 7.1 SP4 Server:
SSH to the primary appliance with PuTTY, Cygwin, or another utility, and log in with the emcsrv operating system account.
sudo su rsaadmin
 
Part 1 – Restart services, making sure Oracle is not stuck
cd /usr/local/RSASecurity/RSAAuthenticationManager/utils
./rsautil manage-secrets -a recover
cd /usr/local/RSASecurity/RSAAuthenticationManager/server
./rsaam stop all

 
When all services have stopped, check for leftover Oracle processes:
 
ps -ef | grep ora
If you see only something like "9469  9141  0 16:28 pts/0    00:00:00 grep ora", that is your own grep command looking for ora, so there are no stuck Oracle processes.
But if you see something like
 
rsaadmin  9655     1  0 May14 ?        00:00:00 ora_q004_oyrpf14j
rsaadmin 11829     1  0 May09 ?        00:00:53 ora_pmon_oyrpf14j
rsaadmin 11831     1  0 May09 ?        00:00:27 ora_psp0_oyrpf14j
rsaadmin 11833     1  0 May09 ?        00:00:08 ora_mman_oyrpf14j
 
or more, then you need to kill these process IDs (PIDs). The PID is the number in the second column.
 
kill -9 11833
kill -9 11831

 
Sometimes you have to kill every PID, and sometimes killing one kills the rest, so check periodically with
ps -ef | grep ora
Repeat until no Oracle processes remain, then start the services again:
./rsaam start all
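The check for leftover Oracle processes in Part 1 can be sketched as a small filter that prints only the PIDs. This is an illustration, not part of the RSA tooling: the function name stuck_oracle_pids is hypothetical, and it assumes the background processes are named ora_*, as in the sample output above. It only lists PIDs and leaves the kill step to you.

```shell
#!/bin/bash
# Hypothetical helper: read `ps -ef` output on stdin and print the PIDs
# of leftover Oracle background processes, one per line.
stuck_oracle_pids() {
    # In `ps -ef` output, field 2 is the PID and field 8 is the command.
    # Matching on the command field skips the "grep ora" line and the
    # header line, which a plain `grep ora` would not.
    awk '$8 ~ /^ora_/ { print $2 }'
}
```

Usage: ps -ef | stuck_oracle_pids. Empty output means no stuck Oracle processes were found.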
 
Part 2 – Typical DB fixes
cd /usr/local/RSASecurity/RSAAuthenticationManager/utils
./rsautil manage-database -a stop-db
./rsautil manage-database -a start-db
./rsautil manage-secrets -a get com.rsa.db.root.password

<copy password>
. ./rsaenv         <dot> <space> <dot> <slash> in Linux, rsaenv.cmd in Windows
sqlplus sys/<paste password> as sysdba
SQL>
At the SQL> prompt, enter the following commands. SQL commands end with a semicolon (;).
shutdown immediate;
startup mount;
alter database open;
recover database;     If recovery is not needed, you will see ORA-00283: recovery session canceled due to errors, ORA-00264: no recovery required
alter database open;
exit

Try to back up with rsautil:
./rsautil manage-backups -a export -f /tmp/bac<date>.dmp
This creates /tmp/bac<date>.dmp and /tmp/bac<date>.secrets. Use today's date for <date>, and use WinSCP to copy the files off the RHEL server or appliance.
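The <date> placeholder in the backup file name can be filled in by the shell. The following is a minimal sketch: the function name backup_path and the YYYYMMDD date format are assumptions chosen for illustration, since the article only says to use today's date.

```shell
#!/bin/bash
# Hypothetical helper: build the backup path /tmp/bac<date>.dmp
# with today's date substituted for <date>.
backup_path() {
    # date +%Y%m%d prints the date as eight digits (year, month, day).
    printf '/tmp/bac%s.dmp\n' "$(date +%Y%m%d)"
}
```

Usage, from the utils directory: ./rsautil manage-backups -a export -f "$(backup_path)"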
 
If this backup still fails, try the following
 
Part 3 – Less typical DB fixes to try if the above did not work
cd /usr/local/RSASecurity/RSAAuthenticationManager/utils
./rsautil manage-database -a stop-db
./rsautil manage-database -a start-db
./rsautil manage-secrets -a get com.rsa.db.root.password

<copy password>
. ./rsaenv         <dot> <space> <dot> <slash> in Linux, rsaenv.cmd in Windows
sqlplus sys/<paste password> as sysdba
SQL>
At the SQL> prompt, enter the following commands. SQL commands end with a semicolon (;).
shutdown immediate;
startup mount;
alter database clear unarchived logfile group 1;
alter database clear unarchived logfile group 2;
alter database clear unarchived logfile group 3;
alter database open;
shutdown immediate;
startup;
exit

You may need to restart the DB with rsautil instead of with ./rsaam:
./rsautil manage-database -a stop-db
./rsautil manage-database -a start-db

Try to back up with rsautil:
./rsautil manage-backups -a export -f /tmp/bac<date>.dmp
This creates /tmp/bac<date>.dmp and /tmp/bac<date>.secrets. Use today's date for <date>, and use WinSCP to copy the files off the RHEL server or appliance.
 
If this backup still fails, try the following
cd /usr/local/RSASecurity/RSAAuthenticationManager/utils
./rsautil manage-secrets -a get com.rsa.db.root.password

<copy password>
. ./rsaenv         <dot> <space> <dot> <slash> in Linux, rsaenv.cmd in Windows
sqlplus sys/<paste password> as sysdba
SQL>
At the SQL> prompt, enter the following commands. SQL commands end with a semicolon (;).
shutdown immediate;
startup nomount;
alter system set db_recovery_file_dest_size=160G scope=both;
alter database mount;
alter database open;
exit
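The db_recovery_file_dest_size setting above raises the space limit for the recovery area to 160G, so it may be worth confirming how much disk space is actually free on that filesystem. This is a minimal sketch under assumptions: the function name free_gb is hypothetical, and the directory is passed as an argument because this article does not name the recovery area location.

```shell
#!/bin/bash
# Hypothetical helper: print the free space, in whole GB, of the
# filesystem that holds the given directory.
free_gb() {
    # df -P prints one POSIX-format line per filesystem and -k reports
    # 1024-byte blocks. Line 2, field 4 is the available space in KB.
    df -Pk "$1" | awk 'NR == 2 { print int($4 / 1024 / 1024) }'
}
```

Usage: free_gb /path/to/recovery/area, where the path is a placeholder for the actual recovery area directory.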

Try to back up with rsautil:
./rsautil manage-backups -a export -f /tmp/bac<date>.dmp
This creates /tmp/bac<date>.dmp and /tmp/bac<date>.secrets. Use today's date for <date>, and use WinSCP to copy the files off the RHEL server or appliance.
 
If this backup still fails, you are out of luck.

 
Workaround: Rebuild from scratch.
Notes: You may need to delete the \Windows\Temp\*.sql files on Windows 2003 Server, or delete (rm) the <RSA_Home>/db/<sid>/bdump/*.trc files.
