000028168 - KB-1270 - How to rebuild Oracle ASM

Document created by RSA Customer Support Employee on Jun 14, 2016. Last modified by RSA Customer Support Employee on Apr 21, 2017.
Version 2

Article Content

Article Number: 000028168
Resolution

Updated to reflect that this procedure is for ACM 3.6 and 4.x using Oracle 10g. Oracle 11g uses different processes.


This process should only be used to recover from a catastrophic database failure and only when Aveksa support has validated the problem.


 


This process will destroy all data in the target database.


 


There are several things that need to be in place for the database to come up properly:


  1. Oracle CSSD needs to be running
  2. Oracle Listener needs to be running
  3. Oracle ASM needs to be running
  4. Oracle ASM diskgroup needs to be properly configured
  5. Oracle AVDB needs to be running

Any one of these can fail. This page describes how to check each one and how to make sure it is working.


Oracle CSSD


Oracle CSSD (Cluster Synchronization Services daemon) provides Oracle's clustering services. If this process is not running, the system will not work correctly.


Do this section as root.


Check if it is running properly with the following:


[root@vm-sandbox-dzehme-01 ~]# ps -ef|grep cssd
root      5224  5157  0 14:11 pts/3    00:00:00 grep cssd
root     25513     1  0 Sep04 ?        00:00:00 /bin/su -l oracle -c sh -c 'cd /u01/app/oracle/product/10.2.0/db_1/log/vm-sandbox-dzehme-01/cssd;  ulimit -c unlimited; exec /u01/app/oracle/product/10.2.0/db_1/bin/ocssd '
oracle   25619 25513  0 Sep04 ?        00:00:45 /u01/app/oracle/product/10.2.0/db_1/bin/ocssd.bin

If you don't see ocssd.bin running, fix that before going any further.
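
The same check can be scripted so that it does not trip over its own grep line. The following is a minimal sketch and not part of the original procedure: the function name cssd_running is made up for illustration, and it reads a "ps -ef" listing on standard input.

```shell
# Hypothetical helper: succeed if a "ps -ef" listing on stdin shows ocssd.bin.
# The [o] bracket keeps the pattern from matching a grep command line.
cssd_running() {
    grep -q '[o]cssd\.bin'
}

# Usage (as root):
#   ps -ef | cssd_running && echo "CSSD is up" || echo "CSSD is NOT running"
```

Reading from standard input keeps the helper easy to try against a saved ps listing before running it on a live system.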

If it is not running, first check /etc/init.d/init.cssd. We have seen cases where this gets truncated to 0 bytes:


[root@vm-sandbox~]# ls -l /etc/init.d/init.cssd
-rwxr-xr-x  1 root root 39611 Sep  4 11:06 /etc/init.d/init.cssd

If this file is 0 bytes, get the file from another machine.
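
A truncated file can also be detected without reading the ls output by eye. This is a small sketch under the assumption of a POSIX shell; is_truncated is a hypothetical name, not a tool shipped with the product.

```shell
# Hypothetical helper: succeed if FILE exists but is empty (0 bytes).
is_truncated() {
    [ -e "$1" ] && [ ! -s "$1" ]
}

# Usage:
#   is_truncated /etc/init.d/init.cssd && echo "init.cssd is 0 bytes, restore it"
```

The same test should work for any other file that turns out to be truncated, since it only looks at existence and size.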

Make sure the startup link is there:


[root@vm-sandbox~]# ls -l /etc/rc3.d/*cssd*
lrwxrwxrwx  1 root root 21 Sep  4 11:06 /etc/rc3.d/S96init.cssd -> /etc/init.d/init.cssd

If this link is missing, recreate it:
[root@vm-sandbox~]# ln -s /etc/init.d/init.cssd /etc/rc3.d/S96init.cssd
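
To avoid duplicating a link that is already there, the recreate step can be guarded. This is a minimal sketch, assuming a POSIX shell with readlink available; ensure_link is a hypothetical name used only for illustration.

```shell
# Hypothetical helper: create LINK -> TARGET only if LINK is not already a symlink.
ensure_link() {
    target=$1
    link=$2
    if [ -L "$link" ]; then
        echo "exists: $link -> $(readlink "$link")"
    else
        ln -s "$target" "$link" && echo "created: $link -> $target"
    fi
}

# Usage (as root):
#   ensure_link /etc/init.d/init.cssd /etc/rc3.d/S96init.cssd
```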

Make sure that cssd is started by inittab:


...
l0:0:wait:/etc/rc.d/rc 0
l1:1:wait:/etc/rc.d/rc 1
l2:2:wait:/etc/rc.d/rc 2
h1:35:respawn:/etc/init.d/init.cssd run >/dev/null 2>&1 </dev/null
l3:3:wait:/etc/rc.d/rc 3
l4:4:wait:/etc/rc.d/rc 4
l5:5:wait:/etc/rc.d/rc 5
l6:6:wait:/etc/rc.d/rc 6
...

The init.cssd line must appear before the rc 3 line, not at the end of the file. If it is missing or in the wrong place, add it or move it to the line just before rc 3.
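
The ordering rule can be checked mechanically. The sketch below assumes the inittab layout shown above; the function name cssd_before_rc3 is hypothetical, and it reads the file content on standard input.

```shell
# Hypothetical helper: succeed if the first init.cssd entry in an inittab
# (read from stdin) comes before the first "rc 3" entry.
cssd_before_rc3() {
    awk '
        /^#/ { next }                                # skip comment lines
        /init\.cssd/ && !cssd { cssd = NR }          # first init.cssd line
        /\/etc\/rc\.d\/rc 3/ && !rc3 { rc3 = NR }    # first rc 3 line
        END { exit !(cssd && rc3 && cssd < rc3) }
    '
}

# Usage (as root):
#   cssd_before_rc3 < /etc/inittab || echo "init.cssd entry missing or misplaced"
```

A non-zero exit covers both cases described above: the entry is missing, or it comes after the rc 3 line.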
 

If CSSD is not running, it is best to reboot the machine after correcting these items. Make sure the Aveksa services do NOT start on reboot by running:


chkconfig --levels 345 aveksa_server off
chkconfig --levels 345 aveksa_agent off

Then reboot:
reboot

 


Oracle Listener


The listener accepts network connections from clients and hands them to the database.


Do this section as oracle.


Check the status:


[oracle@vm-sandbox]$ lsnrctl status
LSNRCTL for Linux: Version 10.2.0.2.0 - Production on 12-SEP-2008 14:18:05
Copyright (c) 1991, 2005, Oracle.  All rights reserved.
Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=vm-sandbox-dzehme-01.aveksa.local)(PORT=1555)))
STATUS of the LISTENER
------------------------
Alias                     LISTENER
Version                   TNSLSNR for Linux: Version 10.2.0.2.0 - Production
Start Date                05-SEP-2008 11:19:11
Uptime                    7 days 2 hr. 58 min. 53 sec
Trace Level               off
Security                  ON: Local OS Authentication
SNMP                      OFF
Listener Parameter File   /u01/app/oracle/product/10.2.0/db_1/network/admin/listener.ora
Listener Log File         /u01/app/oracle/product/10.2.0/db_1/network/log/listener.log
Listening Endpoints Summary...
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=vm-sandbox-dzehme-01.aveksa.local)(PORT=1555)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=EXTPROC0)))
Services Summary...
Service "AVDB" has 1 instance(s).
  Instance "AVDB", status READY, has 1 handler(s) for this service...
Service "AVDBXDB" has 1 instance(s).
  Instance "AVDB", status READY, has 1 handler(s) for this service...
Service "AVDB_XPT" has 1 instance(s).
  Instance "AVDB", status READY, has 1 handler(s) for this service...
The command completed successfully

In particular, make sure (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=vm-sandbox.aveksa.local)(PORT=1555))) appears with your hostname. This is the main listener port that clients use to communicate with the database.
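
The endpoint check can be scripted against the status output. This is a sketch only: listener_has_endpoint is a hypothetical name, and it assumes the output layout shown above, where endpoints are listed after the "Listening Endpoints Summary" line. Lines before that point are ignored so that the "Connecting to" line is not mistaken for a live endpoint.

```shell
# Hypothetical helper: succeed if "lsnrctl status" output on stdin lists a
# TCP endpoint for HOST ($1) and PORT ($2) in the endpoints summary.
listener_has_endpoint() {
    sed -n '/^Listening Endpoints Summary/,$p' |
        grep -qiF "(PROTOCOL=tcp)(HOST=$1)(PORT=$2)"   # -F literal, -i for tcp/TCP
}

# Usage (as oracle, with your own host name):
#   lsnrctl status | listener_has_endpoint vm-sandbox.aveksa.local 1555 \
#       || echo "TCP endpoint missing"
```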
 

If this does not look right, first check listener.ora:


[oracle@vm-sandbox]$ cd $ORACLE_HOME/network/admin
[oracle@vm-sandbox]$ cat listener.ora
# listener.ora Network Configuration File: /u01/app/oracle/db_1//network/admin/listener.ora
# Generated by Oracle configuration tools.
LISTENER =
  (DESCRIPTION_LIST =
    (DESCRIPTION =
      (ADDRESS = (PROTOCOL = TCP)(HOST = vm-sandbox-dzehme-01.aveksa.local)(PORT = 1555))
      (ADDRESS = (PROTOCOL = IPC)(KEY = EXTPROC0))
    )
  )

We have also seen this file truncated to 0 bytes, so if this file is not correct, restore it with the contents here (adjusting the host name).

The listener can be stopped with:


[oracle@vm-sandbox]$ lsnrctl stop
LSNRCTL for Linux: Version 10.2.0.2.0 - Production on 12-SEP-2008 14:23:33
Copyright (c) 1991, 2005, Oracle.  All rights reserved.
Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=vm-sandbox-dzehme-01.aveksa.local)(PORT=1555)))
The command completed successfully

The listener can be started with:


[oracle@vm-sandbox]$ lsnrctl start
LSNRCTL for Linux: Version 10.2.0.2.0 - Production on 12-SEP-2008 14:28:18
Copyright (c) 1991, 2005, Oracle.  All rights reserved.
Starting /u01/app/oracle/product/10.2.0/db_1/bin/tnslsnr: please wait...
TNSLSNR for Linux: Version 10.2.0.2.0 - Production
System parameter file is /u01/app/oracle/product/10.2.0/db_1/network/admin/listener.ora
Log messages written to /u01/app/oracle/product/10.2.0/db_1/network/log/listener.log
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=vm-sandbox-dzehme-01.aveksa.local)(PORT=1555)))
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=EXTPROC0)))
Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=vm-sandbox-dzehme-01.aveksa.local)(PORT=1555)))
STATUS of the LISTENER
------------------------
Alias                     LISTENER
Version                   TNSLSNR for Linux: Version 10.2.0.2.0 - Production
Start Date                12-SEP-2008 14:28:18
Uptime                    0 days 0 hr. 0 min. 0 sec
Trace Level               off
Security                  ON: Local OS Authentication
SNMP                      OFF
Listener Parameter File   /u01/app/oracle/product/10.2.0/db_1/network/admin/listener.ora
Listener Log File         /u01/app/oracle/product/10.2.0/db_1/network/log/listener.log
Listening Endpoints Summary...
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=vm-sandbox-dzehme-01.aveksa.local)(PORT=1555)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=EXTPROC0)))
The listener supports no services
The command completed successfully

After starting, the listener may not list any services for some time (we have seen this take 1-2 minutes) before it shows the database instances. If you are having ASM or AVDB problems, you probably will not see the database instances at all, but you need the listener working before continuing.
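
Because registration can lag, polling the status is often more convenient than re-running it by hand. The sketch below is an illustration only: wait_for_service is a hypothetical name, the retry count and delay are arbitrary, and it simply looks for the 'Service "NAME" has' line shown in the status output above.

```shell
# Hypothetical helper: run a status command up to TRIES times, DELAY seconds
# apart, until its output contains: Service "NAME" has
#   wait_for_service NAME TRIES DELAY command [args...]
wait_for_service() {
    service=$1
    tries=$2
    delay=$3
    shift 3
    i=0
    while [ "$i" -lt "$tries" ]; do
        if "$@" 2>/dev/null | grep -qF "Service \"$service\" has"; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}

# Usage (as oracle): poll for up to about two minutes.
#   wait_for_service AVDB 12 10 lsnrctl status || echo "AVDB not registered yet"
```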


Oracle ASM


First see if you really need to rebuild by running the two tests below.


On the “active” node, make sure the Oracle database processes are running, then run these tests:


A. See if the Oracle volume is present by running:


sudo service oracleasm listdisks

It should return “VOL1”.

B. If you run the command sequence:


$ export ORACLE_SID=+ASM
$ asmcmd
ASMCMD> ls

Does it return DG01/?

Typically neither of the tests above works, which means the DB needs to be recreated. If they return the correct information, contact Aveksa Support for additional troubleshooting.
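
The decision from tests A and B can be written down as a small function. This is a sketch with one loudly stated assumption: the article does not say what to do when only one of the two tests succeeds, so the code below treats any partial success as a case for support rather than a rebuild. The name rebuild_decision is hypothetical.

```shell
# Hypothetical helper: classify the two test results.
#   $1 = output of "service oracleasm listdisks"
#   $2 = output of "ls" inside asmcmd
rebuild_decision() {
    vol=no
    dg=no
    printf '%s\n' "$1" | grep -qx 'VOL1' && vol=yes
    printf '%s\n' "$2" | grep -qx 'DG01/' && dg=yes
    if [ "$vol" = no ] && [ "$dg" = no ]; then
        echo rebuild
    else
        # Assumption: anything other than a double failure is escalated.
        echo contact-support
    fi
}

# Usage:
#   rebuild_decision "$(sudo service oracleasm listdisks)" "<asmcmd ls output>"
```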


 


This section will cause data loss! Unless you have a dump to recover from, do not proceed. At this point, Oracle support should be consulted for further help.
 

1. Make sure Oracle CSSD is running (see above)


2. Make sure Oracle Listener is running


3. Find the oracle partition


fdisk -l 

The partition for Oracle should be the largest one.

4. Clean/format the partition:


dd if=/dev/zero of=/dev/<partition> bs=8192 count=12800 
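
For a sense of scale, dd with these arguments writes bs × count bytes of zeros from the start of the partition, not the whole partition. A quick worked calculation (dd_mib is a hypothetical helper name):

```shell
# Hypothetical helper: MiB written by "dd bs=BS count=COUNT".
dd_mib() {
    echo $(( $1 * $2 / 1024 / 1024 ))
}

dd_mib 8192 12800   # prints 100
```

So the command in this step overwrites the first 100 MiB of the partition.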

5. Start Oracle (if not auto-started by the reboot)


sudo /etc/init.d/dbora start

6. Create the Oracle Volume:


sudo service oracleasm createdisk VOL1 /dev/<partition>

For example:
sudo service oracleasm createdisk VOL1 /dev/sda3

A. (Pre 3.6) If not already done, get a release distribution, untar it, and then deploy the upgrade tools.


B. DO THE STEPS BELOW ONLY AS THE ORACLE USER!!!


C. In certain situations you may have to perform a step several times. If a step fails, try rebooting and, once the DB comes up, perform the failed step again.


7. Create the +ASM instance by running this script (pre 3.6):


cd /tmp/postinstall/create_asm
export ORACLE_SID=+ASM
./Create_ASM_Instance.sh

(post 3.6):
cd /home/oracle/deploy/create_asm
export ORACLE_SID=+ASM
./Create_ASM_Instance.sh

Note: you may see an error stating that DG01 cannot be deleted. This is OK if the test above showed that there was no DG01.

 

8. Create the AVDB database by running this script (pre 3.6):


cd /tmp/postinstall/create_avdb
export ORACLE_SID=AVDB
./Create_AVDB_Instance.sh

(post 3.6):
cd /home/oracle/deploy/create_avdb
export ORACLE_SID=AVDB
./Create_AVDB_Instance.sh

9. Create the Aveksa schema in the database (pre 3.6):


cd /home/oracle/database
./createSchema_V3.5.sh

(post 3.6):
cd /home/oracle/database
./createSchema.sh

10. Load backup if needed


11. Ensure that /etc/oratab is set to start the AVDB and +ASM instances. It should look like this:


+ASM:/u01/app/oracle/product/10.2.0/db_1:Y
AVDB:/u01/app/oracle/product/10.2.0/db_1:Y
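
The oratab entries can be verified with a short check. This is a sketch, assuming the colon-separated SID:ORACLE_HOME:flag layout shown above; oratab_autostart is a hypothetical name, and it reads the file content on standard input.

```shell
# Hypothetical helper: succeed if an oratab (read from stdin) marks both
# +ASM and AVDB with the autostart flag Y in the third field.
oratab_autostart() {
    awk -F: '
        /^ *#/ { next }                          # skip comment lines
        $1 == "+ASM" && $3 == "Y" { asm = 1 }
        $1 == "AVDB" && $3 == "Y" { avdb = 1 }
        END { exit !(asm && avdb) }
    '
}

# Usage:
#   oratab_autostart < /etc/oratab || echo "fix the +ASM/AVDB entries in /etc/oratab"
```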


The attached document has the details of this process.


 


 