On 06/24/2011 02:40 PM, Lars Ellenberg wrote:
>> /etc/drbd.conf:
> You cannot configure away users of DRBD.
> You need to stop whatever is using DRBD first.
>
> Maybe start by checking whether "bad things"
> happen on shutdown, on reboot, or on both?
>
> If you do reboot into single user mode,
> then manually configure DRBD,
> does it still "detect split brain"?
>
> If not, shutdown is ok,
> and reboot is the problem.
>

Single user mode was very revealing; thank you for the suggestion, Lars.
The problem is in the reboot. Once I got into single user mode, I brought
up the networking, loaded the drbd module, ran "drbdadm up r0" and
"drbdadm primary all", and everything came up flawlessly.

Ubuntu doesn't verify that packets are actually being passed when it
brings up networking, and I'm using 802.3ad bonding, which takes some time
to establish. As a result, the network is still down when the drbd init.d
script runs.

I fixed the problem rather sloppily by inserting the following into the
drbd "start" section:

    while ( ! ping -c 1 peer-ip-address ); do
        echo peer-ip-address not up
    done

It sticks here until the ethernet ports bond and networking actually comes
up. When it does, drbd initializes and goes dual-primary without issue.

Is there a better, more elegant way to do this? Does the drbd init.d
script do some verification of networking before it attempts to bring up
resources? I realize my method is problematic if the other peer is down
when a reboot happens, because the loop then never exits.

>> resource r0 {
>>   protocol C;
>>   startup {
>>     wfc-timeout 15;
>>     degr-wfc-timeout 60;
> This is very unusual, and likely not what you meant.
> typically wfc-timeout is (much) larger than degr-wfc-timeout.
>

Thanks. I have fixed this.

>>   }
>>   net {
>>     cram-hmac-alg sha1;
>>     shared-secret "secret";
>>     allow-two-primaries;
>>     after-sb-0pri discard-younger-primary;
>>     after-sb-1pri discard-secondary;
>>     after-sb-2pri call-pri-lost-after-sb;
> You are aware that you configured automatic data loss there, right?
> Just because something is "younger" does not mean a thing.
> Just because something is "secondary" *at the point in time of the
> connection handshake* does not mean it has bad data.
>

If the rebooting peer shuts down properly, there shouldn't be any data
written between drbd going down and coming up, right? Nothing written,
nothing lost is what I'm presuming.

> Jun 24 12:30:39 serverb kernel: [ 221.439427] block drbd0: incompatible after-sb-0pri settings
> Jun 24 12:30:39 serverb kernel: [ 221.443142] block drbd0: conn( WFReportParams -> Disconnecting )
>
> How come the after-sb settings are "incompatible"?
> They should be the same on both peers.

Fixed this. I think I changed the other peer without properly restarting
drbd. Thanks.
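
On the question of waiting for the bonded link without hanging forever
when the peer is down: one possible variation is to bound the ping loop.
This is only a sketch, not something taken from the drbd init script. The
function name wait_for_peer, the default of 30 tries, and the one-second
pause are all assumptions, and "-W 1" assumes the Linux iputils ping,
where it means "wait at most one second for a reply".

```shell
#!/bin/sh
# Hypothetical helper (not part of the stock drbd init script).
# Usage: wait_for_peer HOST [MAX_TRIES]
# Returns 0 as soon as HOST answers a single ping,
# returns 1 after MAX_TRIES failed pings (default 30).
# Note: plain sh has no local variables, so peer/max_tries/try are global.
wait_for_peer() {
    peer=$1
    max_tries=${2:-30}
    try=1
    # One ping per iteration; -W 1 is the iputils per-reply timeout in seconds.
    while ! ping -c 1 -W 1 "$peer" >/dev/null 2>&1; do
        if [ "$try" -ge "$max_tries" ]; then
            echo "$peer unreachable after $max_tries tries, giving up" >&2
            return 1
        fi
        echo "$peer not up yet (try $try of $max_tries)"
        try=$((try + 1))
        sleep 1
    done
    return 0
}
```

In the "start" section this could be called as

    wait_for_peer peer-ip-address 30 || echo "peer not reachable, starting drbd anyway"

so that a reboot while the other node is down still proceeds after the
tries are used up, instead of blocking the boot indefinitely.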