On Thu, 24 Jan 2013, Felix Frank wrote:

> On 01/22/2013 05:04 PM, Jacek Osiecki wrote:
>> [41706.085879] block drbd0: PingAck did not arrive in time.
>> [41706.085888] block drbd0: peer( Primary -> Unknown ) conn( Connected
>> -> NetworkFailure ) pdsk( UpToDate -> DUnknown )
>> [41706.086007] block drbd0: new current UUID
>> 62770026DDB5FC9D:1AD40906305F01A9:E24FA72FCFB3A8FD:E24EA72FCFB3A8FD
>
> Uhuh. So, your peer just cut the network connection without
> deconfiguring DRBD first (i.e., unmount, go Secondary etc.).

OK, this is something :) My first split brain situation was when one of
the nodes got instant power down. But later it was recurring even after
legitimate reboots...

And yes, you're right. It seems, that network was set to shutdown before
the DRBD (actually, even before the o2cb or ocfs2 too!). I have set the
correct order, but seems that drbd doesn't want to stop:

root@oscar ~> /etc/init.d/drbd stop
DRBD module version: 8.3.11
   userland version: 8.4.1
preferably kernel and userland versions should match.
Stopping all DRBD resources: umount: /dev/drbd0: not mounted
all: Failure: (127) Device minor not allocated
ERROR: Module drbd is in use
Retrying once...
umount: /dev/drbd0: not mounted
all: Failure: (127) Device minor not allocated
ERROR: Module drbd is in use
.

Any ideas? Am I getting it right that if drbd would stop well before
network shutdown, the node would reconnect as primary-primary after
reboot?

> Aside #1 your userland mismatch is rather drastic, you probably want an
> 8.3 userland.

It's so in my linux distro... I got much newer kernel but seems that
still it's only 8.3 drbd in it.

> Aside #2, are you sure you need dual-primary operation? Can vservers
> live-migrate already and noone told me? ;-)

I'm using it for a HA configuration, where two machines are serving the
same data through WWW server. Machines are set behind a load balancer.
It would be nice if after rebooting any node, it would refresh its data
from the other node and then start serving data as usual...

Greetings,
-- 
Jacek Osiecki josiecki at silvercube.pl
Silvercube s.c.
ul. Makuszynskiego 4
31-752 Kraków
+48 (12) 684 21 00
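
For reference, the teardown order discussed in this message (unmount, stop
ocfs2 and o2cb, stop DRBD, and only then take the network down) might be
sketched roughly as below. This is an illustration only, not a tested fix
for the "Module drbd is in use" error. The device /dev/drbd0 and the
/etc/init.d/drbd script come from the output above; the /etc/init.d/ocfs2
and /etc/init.d/o2cb script paths, the run helper and the stop_drbd_stack
function are assumptions made for the example. It defaults to a dry run
that only prints what it would do.

```shell
#!/bin/sh
# Sketch of a DRBD teardown order: release everything that uses the
# device before the network is shut down.
# ASSUMPTION: the ocfs2/o2cb init script paths are guesses; adjust them
# to whatever the distribution actually ships.

# Dry run by default: print the commands instead of executing them.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: run a command, or just echo it in dry-run mode.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Hypothetical function: stop the stack from the top down.
stop_drbd_stack() {
    # 1. Unmount the filesystem on the DRBD device. A failure here
    #    (for example "not mounted") is reported but is not fatal.
    run umount /dev/drbd0 || echo "umount failed, continuing" >&2

    # 2. Stop the cluster filesystem services that sit on top of DRBD.
    run /etc/init.d/ocfs2 stop || return 1
    run /etc/init.d/o2cb stop || return 1

    # 3. Stop DRBD itself while the network is still up.
    run /etc/init.d/drbd stop || return 1
}

stop_drbd_stack
```

Running it with DRY_RUN=0 as root would execute the commands for real.
Whether a clean stop in this order is enough to avoid the split brain
after reboot is the open question above and is not answered by the sketch.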