<html>
<head>
<meta content="text/html; charset=ISO-8859-1"
http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<div class="moz-cite-prefix">On 16/07/2013 14:55, Brian Candler
wrote:<br>
</div>
<blockquote cite="mid:51E550C4.3030107@pobox.com" type="cite">
<br>
* Check /proc/drbd on target, require network is Connected and
local disk is UpToDate. [No check on source?]<br>
* on target: drbdsetup &lt;dev&gt; secondary (just to be sure?).
No wait or status check?<br>
* on both nodes: drbdsetup &lt;dev&gt; disconnect. No wait or
status check?<br>
</blockquote>
Actually it does wait for GetProcStatus().is_standalone (i.e.
connection status StandAlone)<br>
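<br>
That wait can be pictured roughly as below. This is only a sketch of the idea, not Ganeti's actual code: the names connection_states and wait_for_state, and the attempts/delay defaults, are made up for illustration.<br>
<pre>
```python
import re
import time

# Hypothetical helpers for illustration; not Ganeti's implementation.
# Matches the per-device lines of /proc/drbd, e.g. " 0: cs:StandAlone ro:..."
CS_RE = re.compile(r"^\s*(\d+): cs:(\S+)", re.MULTILINE)


def connection_states(proc_text):
    """Map each minor number to its cs: field in /proc/drbd-style text."""
    return {int(minor): cs for minor, cs in CS_RE.findall(proc_text)}


def wait_for_state(read_status, minor, wanted,
                   attempts=10, delay=0.5, sleep=time.sleep):
    """Poll read_status() until the minor reports one of the wanted states.

    Returns True on success, False if the attempts run out.
    """
    for _ in range(attempts):
        if connection_states(read_status()).get(minor) in wanted:
            return True
        sleep(delay)
    return False
```
</pre>
On a real node read_status could be something like lambda: open("/proc/drbd").read(), with wanted set to {"StandAlone"} after the disconnect.<br>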
<blockquote cite="mid:51E550C4.3030107@pobox.com" type="cite"> * on
both nodes: drbdsetup &lt;dev&gt; connect. Poll /proc/drbd until
connected or syncing<br>
</blockquote>
More precisely, the code is doing the following on both sides
(roughly simultaneously) to reconnect in multi-master mode:<br>
<br>
drbdsetup &lt;dev&gt; syncer -r 61440 --create-device<br>
drbdsetup &lt;dev&gt; net ipv4:x:x ipv4:y:y C -A
discard-zero-changes -B consensus --create-device -m -a md5 -x
XXXXXX<br>
<br>
You said:<br>
"
Apparently a node was promoted right in the middle of a resync
handshake, and did not like that at all."<br>
<br>
Now, I'm not clear which bit is the "promotion": it looks like
"drbdsetup &lt;dev&gt; net ... -m" both reconnects *and*
promotes to master in one step.<br>
<br>
If the primary disk is written to during the short window in which
the secondary is disconnected from it, and we then reconnect in
dual-master mode, some resync is expected alongside the promotion.
This appears to work: if I configure the VM to write aggressively to
disk and then migrate, I see it go through a resync phase:<br>
<br>
0: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r-----<br>
0: cs:StandAlone ro:Secondary/Unknown ds:UpToDate/DUnknown r-----<br>
0: cs:WFConnection ro:Secondary/Unknown ds:UpToDate/DUnknown C
r-----<br>
0: cs:SyncTarget ro:Primary/Primary ds:Inconsistent/UpToDate C
r-----<br>
0: cs:Connected ro:Primary/Primary ds:UpToDate/UpToDate C r-----<br>
0: cs:StandAlone ro:Primary/Unknown ds:UpToDate/DUnknown r-----<br>
0: cs:WFBitMapS ro:Primary/Secondary ds:UpToDate/Consistent C
r-----<br>
0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----<br>
<br>
So the race seems to be elsewhere.<br>
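<br>
To look for this pattern in a longer capture, something along these lines could pick out samples where the local role is already Primary while the connection state is still a resync-type state. It is a rough sketch only: the field layout is taken from the lines above, while the RESYNC_STATES set and the function names are my own assumptions and not exhaustive.<br>
<pre>
```python
import re

# Field layout taken from the /proc/drbd lines quoted above.
LINE_RE = re.compile(r"(\d+): cs:(\S+) ro:(\S+)/(\S+) ds:(\S+)/(\S+)")

# Connection states treated here as "resync in progress".
# This set is an assumption for illustration and is not exhaustive.
RESYNC_STATES = {"WFBitMapS", "WFBitMapT", "SyncSource", "SyncTarget"}


def parse_line(line):
    """Return the cs/ro/ds fields as a dict, or None if the line does not match."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    minor, cs, ro_local, ro_peer, ds_local, ds_peer = m.groups()
    return {"minor": int(minor), "cs": cs,
            "ro": (ro_local, ro_peer), "ds": (ds_local, ds_peer)}


def primary_during_resync(lines):
    """List the samples that are Primary locally while cs is a resync state."""
    hits = []
    for line in lines:
        st = parse_line(line)
        if st and st["cs"] in RESYNC_STATES and st["ro"][0] == "Primary":
            hits.append(st)
    return hits
```
</pre>
Fed the eight lines above, this flags the SyncTarget and WFBitMapS samples, i.e. the two points where a Primary role coincides with resync activity.<br>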
<br>
To answer your other question: no, I've not tried building any other
version of drbd; I'm just using the stock one in Debian Wheezy.<br>
<br>
Regards,<br>
<br>
Brian.<br>
<br>
</body>
</html>