This is a continuation of [heartbeat and drbd / Failover / Failback], but I changed the subject to dopd because I think that is where my problem lies.

I think I'm starting to get my mind around the why and how of dopd, but currently, when the primary node goes down (I pull the plug), the secondary node gets the message to outdate its drbd resource. Here is where I think it happens on node2 after I pull the plug on node1:

-------------------------------------------------------------------
Dec 4 13:22:13 svr92 kernel: drbd0: helper command: /sbin/drbdadm outdate-peer
Dec 4 13:22:13 svr92 kernel: drbd0: disk( UpToDate -> Outdated )
Dec 4 13:22:13 svr92 kernel: drbd0: outdate-peer helper broken, returned 255
Dec 4 13:22:13 svr92 kernel: drbd0: State change failed: Refusing to be Primary without at least one UpToDate disk
-------------------------------------------------------------------

I can't find any reference to what "returned 255" means, but the outdate-peer helper appears to be broken.

So node1 goes down and, somewhere in the process, node2's resource gets outdated, so heartbeat can't bring it up. There goes my redundancy. But if node1 really is dead, how can I undo the outdate flag on node2 so that I can bring that node up as the primary until I can fix node1?

Thoughts?

Rois
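
For anyone comparing this against their own logs, here is a rough Python sketch that pulls the disk-state change and the helper return code out of kernel messages like the ones quoted above. The function name `summarize` and the two regular expressions are my own and only match the exact message formats shown in this post. The table of return codes is my reading of the DRBD 8 documentation for the outdate-peer handler, so treat it as an assumption and check it against your DRBD version. If that reading is right, 255 is not one of the expected codes, which would be consistent with the "helper broken" message.

```python
import re

# Sample taken verbatim from the log excerpt in this post.
SAMPLE = """\
Dec 4 13:22:13 svr92 kernel: drbd0: helper command: /sbin/drbdadm outdate-peer
Dec 4 13:22:13 svr92 kernel: drbd0: disk( UpToDate -> Outdated )
Dec 4 13:22:13 svr92 kernel: drbd0: outdate-peer helper broken, returned 255
Dec 4 13:22:13 svr92 kernel: drbd0: State change failed: Refusing to be Primary without at least one UpToDate disk
"""

# ASSUMPTION: return codes the outdate-peer handler is expected to produce,
# as I understand the DRBD 8 documentation. Verify for your version.
EXPECTED_CODES = {
    3: "peer's disk is already Inconsistent",
    4: "peer was outdated (or was already Outdated)",
    5: "peer could not be reached",
    6: "peer is Primary and refused to be outdated",
    7: "peer was fenced off the cluster (STONITH)",
}

# These patterns only cover the message formats seen in the excerpt above.
DISK_RE = re.compile(r"(drbd\d+): disk\( (\w+) -> (\w+) \)")
HELPER_RE = re.compile(r"(drbd\d+): outdate-peer helper broken, returned (\d+)")


def summarize(log_text):
    """Return a list of (kind, device, ...) tuples found in the log text."""
    events = []
    for line in log_text.splitlines():
        m = DISK_RE.search(line)
        if m:
            # (kind, device, old disk state, new disk state)
            events.append(("disk", m.group(1), m.group(2), m.group(3)))
            continue
        m = HELPER_RE.search(line)
        if m:
            code = int(m.group(2))
            # (kind, device, return code, whether the code is an expected one)
            events.append(("helper", m.group(1), code, code in EXPECTED_CODES))
    return events


if __name__ == "__main__":
    for event in summarize(SAMPLE):
        print(event)
```

Running it on the excerpt reports one disk transition (UpToDate to Outdated) on drbd0 and one helper return code of 255 that is flagged as not being among the expected codes. It does not explain why the helper failed; it only makes the sequence easier to spot in a longer log.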