3

I was performing routine administrative tasks on an Oracle instance and I was not able to connect to it.

I took the following steps to diagnose the problem:

  1. I checked for Oracle processes in memory and saw that the instance was down.
  2. I tried to start up the instance, but it never responded to the startup command. It simply stalled: no output or feedback, no matter how long I waited. Only a kill -9 gets me out of it.
  3. I reviewed the alert log and the last message was 3 days ago:

    "DBW0: terminating instance due to error 472
    Instance terminated by DBW0, pid = 14952"
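
To pull the most recent termination message out of a long alert log without paging through it, something like the following sketch may help. The function name `last_termination` is hypothetical, and the alert log path has to be supplied by the caller, since its location depends on the instance's configuration.

```shell
# Print the last "terminating instance due to error" line found in an
# alert log. Prints nothing if the log contains no such line.
# Usage: last_termination /path/to/alert_log   (path is a placeholder)
last_termination() {
    awk '
        /terminating instance due to error/ { hit = $0 }   # remember latest match
        END { if (hit != "") print hit }                    # emit only the last one
    ' "$1"
}
```

The error number at the end of the printed line is the one to look up.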
    

Questions

  • What can cause DBW0 to terminate an instance?
  • Why don't I get any feedback, either in the sqlplus console or in the alert log, when I try to start up the instance?

When I ran strace, I got this:

ERROR: unable to open /dev/log.

I'm running Oracle on SunOS 5.8 Generic_117350-08 sun4u sparc SUNW,Sun-Fire. The RDBMS version is 9.2.0.8.0

EDIT:

I followed advice from both @balazs-papp and @jsapkota about running truss:

When I run truss on sqlplus and then try to start up the instance, I get this output:

    read(0, " s t a r t u p\n", 1024)               = 8
    write(9, "\0 U\0\006\0\0\0\0\0038A".., 85)      = 85
    read(10, 0x10029A536, 2064)     (sleeping...)
    signotifywait()                 (sleeping...)
    lwp_cond_wait(0xFFFFFFFF7D62B058, 0xFFFFFFFF7D62B068, 0xFFFFFFFF7D621C80) (sleeping...)
    lwp_cond_wait(0xFFFFFFFF7D62B058, 0xFFFFFFFF7D62B068, 0xFFFFFFFF7D621C80) (sleeping...)
    door_return(0x00000000, 0, 0x00000000, 0) (sleeping...)

  • How can I interpret this output? It's totally cryptic to me.

2 Answers

1

I executed startup nomount with an explicit pfile parameter and it works.

I have the init<SID>.ora file in the $ORACLE_HOME/dbs/ directory, and it had always worked before.
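
Since an explicit pfile works while a plain startup does not, it may be worth checking which parameter file a plain startup would find first. Oracle 9i on Unix is documented to search `$ORACLE_HOME/dbs` for `spfile<SID>.ora`, then `spfile.ora`, then `init<SID>.ora`, so a leftover spfile would take precedence over the init file. The sketch below only mimics that search order; the function name `default_param_file` is hypothetical, and whether an spfile is actually involved here is an assumption, not something established above.

```shell
# Print the first parameter file that a plain STARTUP would find,
# following the 9i search order in the dbs directory:
#   spfile<SID>.ora, then spfile.ora, then init<SID>.ora
# Usage: default_param_file "$ORACLE_HOME/dbs" "$ORACLE_SID"
default_param_file() {
    dir=$1
    sid=$2
    for f in "spfile${sid}.ora" "spfile.ora" "init${sid}.ora"; do
        if [ -f "$dir/$f" ]; then
            echo "$dir/$f"
            return 0
        fi
    done
    # None of the three candidates exists in the given directory.
    echo "no parameter file found in $dir" >&2
    return 1
}
```

If this prints an spfile rather than the init file, the default startup and the explicit-pfile startup are reading different settings.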

0
  1. This is quite obvious from the error.

    00472, 00000, "PMON  process terminated with error"
    // *Cause:  The process cleanup process died
    // *Action: Warm start instance
    

PMON died, so DBW0 terminated the instance. If any of the mandatory background processes (SMON, PMON, LGWR, DBWR, CKPT) dies, the remaining processes terminate the instance. So the real question is: why did PMON die?
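
A quick way to see which of these background processes are running is to filter the process list by name. The sketch below assumes the usual Unix naming convention `ora_<process>_<SID>` and a POSIX-style shell; the function name `missing_bg` and the SID `ORCL` are placeholders.

```shell
# Report which mandatory background processes are absent for a given SID.
# Reads a process listing on stdin (for example the output of `ps -ef`).
# Assumes process names of the form ora_<name>_<SID> at the end of a line.
missing_bg() {
    sid=$1
    listing=$(cat)
    for p in pmon smon lgwr dbw0 ckpt; do
        # Anchor at end of line so that SID "ORCL" does not match "ORCL2".
        printf '%s\n' "$listing" | grep "ora_${p}_${sid}\$" >/dev/null \
            || echo "$p"
    done
}

# Usage (ORCL is a placeholder SID):
#   ps -ef | missing_bg ORCL
```

If the instance is completely down, all five names are printed.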

  2. Could be anything. Typically it is an environmental or configuration issue. Time to run strace -f or truss -f, depending on the platform, which we don't know:

    strace -f -o startup.out sqlplus / as sysdba
    SQL> startup
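
On Solaris the corresponding tool is truss, which accepts the same `-f` (follow child processes) and `-o` (write the trace to a file) options. A minimal sketch that picks the tracer from the value reported by `uname -s`; the helper name `tracer_for` is hypothetical.

```shell
# Choose a system-call tracer from the operating system name
# as reported by `uname -s`. Only SunOS and Linux are handled here.
tracer_for() {
    case "$1" in
        SunOS) echo truss ;;
        Linux) echo strace ;;
        *)     echo "no known tracer for $1" >&2; return 1 ;;
    esac
}

# Usage: trace sqlplus and its children, writing the trace to startup.out
#   "$(tracer_for "$(uname -s)")" -f -o startup.out sqlplus / as sysdba
```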
    
Balazs Papp