Hi all,

I built a cluster to protect an Oracle database. The Oracle database files are stored on a DRBD (8.3.13) device using protocol A. Sometimes, however, Oracle cannot fail over when the primary node goes down.

Here are the test steps:

1. There are two nodes, A and B. A is the primary node and B is the secondary node. Oracle runs on node A and executes SQL that inserts a large amount of data.

2. On node B, run the following loop to simulate a failure of node A:

    while [ 0 ] ; do
        # break the network link with iptables
        # disconnect drbd0 and make it primary
        drbdadm disconnect drbd0
        drbdadm primary drbd0
        # mount and start oracle
        ....
        # if start failed, break
        ...
        # stop oracle & umount drbd0
        # reconnect the network link
        drbdadm connect drbd0
        drbdadm -- --discard-my-data connect drbd0
        sleep 5
    done

After several iterations, Oracle cannot be started and one of the following errors appears in alert_<SID>.log:

    ORA-00600: internal error code, arguments: [kcratr_nab_less_than_odr]

or

    ORA-00353: log corruption near block 68622 change 39685781 time 08/25/2012 16:06:42

According to Oracle's MetaLink, the first error means that a power failure caused logical corruption in the control file. The second error means that the redo log file is corrupted.

How can I avoid these errors and make sure Oracle can fail over whenever the primary node crashes?

Thanks.

BTW: protocol A is required because the cluster runs over a WAN and uses a proxy.