[DRBD-user] Reproducible ASSERT( os.conn == C_WF_REPORT_PARAMS )

Brian Candler b.candler at pobox.com
Tue Jul 16 16:32:35 CEST 2013



On 16/07/2013 14:55, Brian Candler wrote:
>
> * Check /proc/drbd on target, require network is Connected and local 
> disk is UpToDate. [No check on source?]
> * on target: drbdsetup <dev> secondary (just to be sure?). No wait or 
> status check?
> * on both nodes: drbdsetup <dev> disconnect. No wait or status check?
Actually it does wait for GetProcStatus().is_standalone (i.e. connection 
status StandAlone)
> * on both nodes: drbdsetup <dev> connect. Poll /proc/drbd until 
> connected or syncing
More precisely, the code is doing the following on both sides (roughly 
simultaneously) to reconnect in multi-master mode:

drbdsetup <dev> syncer -r 61440 --create-device
drbdsetup <dev> net ipv4:x:x ipv4:y:y C -A discard-zero-changes -B consensus --create-device -m -a md5 -x XXXXXX
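
For anyone wanting to reproduce the wait-and-poll steps outside Ganeti, here is a minimal Python sketch of the idea. It is not Ganeti's actual code: the names connection_state, wait_for_state and read_proc_drbd are made up for illustration, and which connection states count as "connected or syncing" is my assumption.

```python
import re
import time

# Matches the start of a per-device line such as " 0: cs:Connected ro:..."
_DEVICE_RE = re.compile(r"^\s*(\d+):\s+cs:(\w+)")


def connection_state(proc_text, minor):
    """Return the cs: value for one minor from /proc/drbd-style text, or None."""
    for line in proc_text.splitlines():
        match = _DEVICE_RE.match(line)
        if match and int(match.group(1)) == minor:
            return match.group(2)
    return None


def read_proc_drbd():
    """Hypothetical helper: read the live status file."""
    with open("/proc/drbd") as handle:
        return handle.read()


def wait_for_state(read_proc, minor, wanted, timeout=10.0, interval=0.5,
                   sleep=time.sleep, clock=time.monotonic):
    """Poll until the minor's connection state is in `wanted`, or time out."""
    deadline = clock() + timeout
    while True:
        state = connection_state(read_proc(), minor)
        if state in wanted:
            return state
        if clock() >= deadline:
            raise TimeoutError("minor %d still in state %r" % (minor, state))
        sleep(interval)
```

After the disconnect one would call wait_for_state(read_proc_drbd, 0, {"StandAlone"}), and after the reconnect something like wait_for_state(read_proc_drbd, 0, {"Connected", "SyncSource", "SyncTarget"}).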

You said:
" Apparently a node was promoted right in the middle of a resync 
handshake, and did not like that at all."

Now, I'm not clear which bit is the "promotion": it looks like 
"drbdsetup <dev> net ... -m" both reconnects *and* promotes to 
master in one step.

If there has been a write to the primary disk during the short period 
when the secondary is disconnected from the primary, and we then 
reconnect in dual-master mode, some resync is expected along with the 
promotion. This appears to work: if I configure the VM to write 
aggressively to disk and then migrate, I see it go through a resync 
phase:

  0: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r-----
  0: cs:StandAlone ro:Secondary/Unknown ds:UpToDate/DUnknown   r-----
  0: cs:WFConnection ro:Secondary/Unknown ds:UpToDate/DUnknown C r-----
  0: cs:SyncTarget ro:Primary/Primary ds:Inconsistent/UpToDate C r-----
  0: cs:Connected ro:Primary/Primary ds:UpToDate/UpToDate C r-----
  0: cs:StandAlone ro:Primary/Unknown ds:UpToDate/DUnknown   r-----
  0: cs:WFBitMapS ro:Primary/Secondary ds:UpToDate/Consistent C r-----
  0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----

So the race seems to be elsewhere.
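
To compare traces like the one above across runs, it can help to split each status line into fields. The following is only a rough sketch based on the line format shown in this message; the DrbdStatus type and parse_line function are names I made up, and the regular expression has only been checked against lines of this shape.

```python
import re
from typing import NamedTuple, Optional


class DrbdStatus(NamedTuple):
    minor: int
    cs: str                  # connection state, e.g. "SyncTarget"
    local_role: str          # first half of ro:
    peer_role: str           # second half of ro:
    local_disk: str          # first half of ds:
    peer_disk: str           # second half of ds:
    protocol: Optional[str]  # "A", "B" or "C"; absent while StandAlone


_LINE_RE = re.compile(
    r"^\s*(?P<minor>\d+):\s+cs:(?P<cs>\S+)"
    r"\s+ro:(?P<lr>[^/\s]+)/(?P<pr>\S+)"
    r"\s+ds:(?P<ld>[^/\s]+)/(?P<pd>\S+)"
    r"(?:\s+(?P<proto>[ABC])(?=\s|$))?"
)


def parse_line(line):
    """Parse one per-device status line; return None for any other line."""
    match = _LINE_RE.match(line)
    if match is None:
        return None
    return DrbdStatus(
        minor=int(match.group("minor")),
        cs=match.group("cs"),
        local_role=match.group("lr"),
        peer_role=match.group("pr"),
        local_disk=match.group("ld"),
        peer_disk=match.group("pd"),
        protocol=match.group("proto"),
    )


def is_dual_primary(status):
    """True when both sides report the Primary role."""
    return status.local_role == "Primary" and status.peer_role == "Primary"
```

Logging the parsed tuples with timestamps on both nodes would show exactly which state each side was in when the other side changed role.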

To answer your other question: no, I've not tried building any other 
version of drbd; I'm just using the stock one in Debian Wheezy.

Regards,

Brian.
