[DRBD-user] Please help... After reboot I'm always getting unresolved split brain (DRBD+OCFS2)
ff at mpexnet.de
Thu Jan 24 17:04:14 CET 2013
On 01/22/2013 05:04 PM, Jacek Osiecki wrote:
> [41706.085879] block drbd0: PingAck did not arrive in time.
> [41706.085888] block drbd0: peer( Primary -> Unknown ) conn( Connected
> -> NetworkFailure ) pdsk( UpToDate -> DUnknown )
> [41706.086007] block drbd0: new current UUID
Uhuh. So, your peer just cut the network connection without
deconfiguring DRBD first (i.e., unmount, go Secondary etc.).
This is bad - your local node cannot know what blocks may have been
touched in between connection loss and peer actually unmounting.
I suspect you have to fix this in your shutdown logic.
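To make the ordering concrete, here is a minimal sketch of what a clean teardown could look like when it runs before the network is stopped. The resource name `r0` and the mount point `/mnt/ocfs2` are assumptions, not taken from your setup, and the helper functions are purely illustrative. With OCFS2 you may also need to stop the cluster stack after unmounting, which this sketch leaves out.

```python
import subprocess


def teardown_plan(resource, mountpoint):
    """Return the commands for a clean DRBD teardown, in the order they must run."""
    return [
        ["umount", mountpoint],              # stop all filesystem access first
        ["drbdadm", "secondary", resource],  # demote while the peer is still connected
        ["drbdadm", "down", resource],       # disconnect and detach before the link goes away
    ]


def run_teardown(resource, mountpoint, dry_run=True):
    """Print the plan (dry_run=True) or execute it, aborting on the first failure."""
    plan = teardown_plan(resource, mountpoint)
    for cmd in plan:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)
    return plan


if __name__ == "__main__":
    # Hypothetical names; substitute your own resource and mount point.
    run_teardown("r0", "/mnt/ocfs2", dry_run=True)
```

The point is only the order: unmount, demote, take the resource down, and all of it while the replication link is still up, so the peer sees a proper role change instead of a network failure.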
If that's true though, that's a problem in itself: Dual-primary should
always be run by pacemaker, with working stonith/fencing in place.
Otherwise you set yourself up for a painful actual split-brain including
data loss some day.
Once pacemaker works OK, it will shut down before your network link
does, and will make sure that your DRBD is properly disconnected.
Aside #1, your userland mismatch is rather drastic, you probably want an
Aside #2, are you sure you need dual-primary operation? Can vservers
live-migrate already and no one told me? ;-)