[Drbd-dev] DRBD8: drbd nodes deadlock in WFBitMapT
Montrose, Ernest
Ernest.Montrose at stratus.com
Mon Apr 2 23:44:51 CEST 2007
Phil,
OK...I have learned a tad more since that last email. So before I even
try the proposed patch here is a way I was able to duplicate the
problem.
Perhaps that will help a bit. Here it is with two nodes 'a' and 'b' .
Suppose
They are in steady states with UUIDS:
Xa:0:Ha:HH:1:1:0:1:0:0
Xb:0:Hb:HH:1:1:0:1:0:0
Role Secondary/Secondary
1) Disconnect/detach /dev/drbdX on nodea
2) Move Current UUID of nodea to history-UUID of nodea and set
current_UUID of nodea to 00000000000 with drbdmeta..
0:0:Xa:HH:1:1:0:1:0:0
3) Now attach and connect /dev/drbdX and the problem will occur
I have attached the logs for my "manufactured" version of the problem
:). I have some extra instrumentation that I had placed in there so you
can ignore
Them...
I am not sure how we got that way under normal operation..:( One thing
is that
the test we are running calls for "pulling the plug" on one of the node.
So
the reboot was not the result of a graceful shutdown and right before
the reboot, the messed up volume had just written its meta data "to
super block" coming fresh from a resync.
Hope this helps
Thanks a lot...
EM--
-----Original Message-----
From: Philipp Reisner [mailto:philipp.reisner at linbit.com]
Sent: Monday, April 02, 2007 4:00 PM
To: drbd-dev at linbit.com
Cc: Montrose, Ernest
Subject: Re: [Drbd-dev] DRBD8: drbd nodes deadlock in WFBitMapT
Hi Ernest,
I removed all the linebreaks:
Drbd2: Writing metadata to superblock now.
......
....... This nodes is powered of and came back with:
13:16:53 je rct = 0 in
13:16:54 je drbd_sync_handshake:
13:16:54 je self
F71E503A8179BC5D:0000000000000000:3F430D4E1D59C3EA:6B0B1DA5CB20689C
13:16:54 je peer
F71E503A8179BC5C:0000000000000000:3F430D4E1D59C3EA:6B0B1DA5CB20689C
13:16:55 je uuid_compare()=0 by rule 4
<<<<<<<<=========---------------
13:16:55 je No resync, but bits in bitmap!
......
13:17:00 je drbd_sync_handshake:
13:17:00 je self
F71E503A8179BC5D:0000000000000000:3F430D4E1D59C3EA:6B0B1DA5CB20689C
13:17:00 je peer
0000000000000000:0000000000000000:F71E503A8179BC5C:3F430D4E1D59C3EA
13:17:00 je uuid_compare()=-2 by rule 6
13:17:00 je Writing meta data super block now.
13:17:01 je writing of bitmap took 11 jiffies
13:17:01 je 12 GB marked out-of-sync by on disk bit-map.
13:17:02 je 13336132 KB now marked out-of-sync by on disk bit-map.
13:17:02 je Writing meta data super block now.
13:17:02 je uuid[History_start] now F71E503A8179BC5D
13:17:02 je uuid[Current] now 0000000000000000
13:17:03 je conn( Connected -> WFBitMapT )
13:17:03 je Writing meta data super block now.
On theh other node=============================
13:16:48 be aftr_isp( 0 -> 1 )
13:16:48 be Handshake successful: DRBD Network Protocol version 86
13:16:48 be peer( Unknown -> Secondary ) conn( WFReportParams ->
Connected ) pdsk( DUnknown -> UpToDate ) peer_isp( 0 -> 1 )
13:16:48 be Writing meta data super block now.
....
13:16:49 be rct = 2 in
13:16:49 be drbd_sync_handshake:
13:16:49 be self
F71E503A8179BC5C:0000000000000000:3F430D4E1D59C3EA:6B0B1DA5CB20689C
13:16:49 be peer
F71E503A8179BC5D:0000000000000000:3F430D4E1D59C3EA:6B0B1DA5CB20689C
13:16:49 be uuid_compare()=-1 by rule 4
<<<<<<<<=========---------------
13:16:49 be uuid[History_start] now F71E503A8179BC5C
13:16:49 be uuid[Current] now 0000000000000000
13:16:49 be conn( Connected -> WFBitMapT )
13:16:49 be Writing meta data super block now.
Unfortunately the clocks of the nodes seems to be not in sync (please
use NTP, this would make reading such logs much easiert!)
I assume that the line marked with the arrow ( <<==-- ) is
where the troubles begin. Because the 'self' and 'peer' line
is matches here inverse. Then the suspicious decision is
the outcome of uuid_compare() with rule 4.
While je (jerry) comes to the conclusion 0, gets ben -1.
The rtc (a few lines before) suggests us that jerry thought
that neither node was a CRASHED_PRIMARY, while ben thought
that jerry was a CRASHED_PRIMARY. Out of you comment
that jerry was powered off, I guess that jerry was really
a CRASHED_PRIMARY, and therefore it sent this information
over to ben. But somehow that CRASHED_PRIMARY bit got
cleared before jerry came to its drbd_uuid_compare()
function.
The question is: how did that happen ?
In the rest of the log entries on jerry we can see how he
tried to do a second sync handshake ...
Back to the question of the lost CRASHED_PRIMARY bit.
That bits can get cleared when:
1) We attach a disk.
2) Connection state goes to connected.
3) In the receive_bitmap.
I think that the clear_bit in receive_bitmap() is wrong, but
I currently can not think of a situation where this clear_bit
can get executed before the drbd_sync_handshake() function.
Ernest, could you provide a bit more context with a few more
lines of jerry's log before 13:16:53.
So far this my proposed patch. But there is still a missing
link in the reasoning:
--- drbd_receiver.c (revision 2821)
+++ drbd_receiver.c (working copy)
@@ -2509,7 +2509,6 @@
D_ASSERT(h->command == ReportBitMap);
}
- clear_bit(CRASHED_PRIMARY, &mdev->flags); // md_write() is in
drbd_start_resync.
if (mdev->state.conn == WFBitMapS) {
drbd_start_resync(mdev,SyncSource);
} else if (mdev->state.conn == WFBitMapT) {
-------------- next part --------------
drbd2: _drbd_set_state: Changing state.
drbd2: _drbd_set_state: setting conn to SyncSource or SyncTarget see message below
drbd2: peer( Secondary -> Unknown ) conn( Connected -> Disconnecting ) pdsk( UpToDate -> DUnknown )
drbd2: _drbd_set_state: Changing state.
drbd2: _drbd_set_state: setting conn to SyncSource or SyncTarget see message below
drbd2: _drbd_set_state: No state new state to change to.
drbd2: short read expecting header on sock: r=-512
drbd2: asender terminated
drbd2: tl_clear()
drbd2: Connection closed
drbd2: Writing meta data super block now.
drbd2: _drbd_set_state: Changing state.
drbd2: _drbd_set_state: setting conn to SyncSource or SyncTarget see message below
drbd2: conn( Disconnecting -> StandAlone )
drbd2: receiver terminated
drbd2: _drbd_set_state: Changing state.
drbd2: _drbd_set_state: setting conn to SyncSource or SyncTarget see message below
drbd2: disk( UpToDate -> Diskless )
drbd2: drbd_bm_resize called with capacity == 0
drbd2: worker terminated
drbd2: _drbd_set_state: Changing state.
drbd2: _drbd_set_state: setting conn to SyncSource or SyncTarget see message below
drbd2: disk( Diskless -> Attaching )
drbd2: No usable activity log found.
drbd2: max_segment_size ( = BIO size ) = 32768
drbd2: drbd_bm_resize called with capacity == 26672264
drbd2: resync bitmap: bits=3334033 words=104190
drbd2: size = 12 GB (13336132 KB)
drbd2: reading of bitmap took 2 jiffies
drbd2: recounting of set bits took additional 0 jiffies
drbd2: 0 KB marked out-of-sync by on disk bit-map.
drbd2: _drbd_set_state: Changing state.
drbd2: _drbd_set_state: setting conn to SyncSource or SyncTarget see message below
drbd2: disk( Attaching -> UpToDate )
drbd2: Writing meta data super block now.
drbd2: _drbd_set_state: Changing state.
drbd2: _drbd_set_state: setting conn to SyncSource or SyncTarget see message below
drbd2: conn( StandAlone -> Unconnected )
drbd2: receiver (re)started
drbd2: drbd_connect: ****************Connecting********
drbd2: _drbd_set_state: Changing state.
drbd2: _drbd_set_state: setting conn to SyncSource or SyncTarget see message below
drbd2: conn( Unconnected -> WFConnection )
drbd2: _drbd_set_state: Changing state.
drbd2: _drbd_set_state: setting conn to SyncSource or SyncTarget see message below
drbd2: conn( WFConnection -> WFReportParams )
drbd2: Handshake successful: DRBD Network Protocol version 86
drbd2: drbd_connect: ****************sending procol while Connecting********
drbd2: drbd_connect: ****************Sending paran while Connecting********
drbd2: drbd_connect: **************** Sending sizes while Connecting********
drbd2: drbd_connect: ****************sendin uuids while Connecting********
drbd2: drbd_connect: ****************Sending states while Connecting********
drbd2: drbd_connect: ****************DONE!!!! Connecting********
drbd2: drbd_sync_handshake:
drbd2: self 0000000000000000:0000000000000000:B2E340FD0FB4F236:B2E340FD0FB4F219
drbd2: peer B2E340FD0FB4F236:0000000000000000:E515D7C70644AF92:B2E340FD0FB4F219
drbd2: uuid_compare()=-1 by rule 5
drbd2: uuid[Current] now 0000000000000000
drbd2: _drbd_set_state: Changing state.
drbd2: _drbd_set_state: setting conn to SyncSource or SyncTarget see message below
drbd2: peer( Unknown -> Secondary ) conn( WFReportParams -> WFBitMapT ) pdsk( DUnknown -> UpToDate )
drbd2: Writing meta data super block now.
-------------- next part --------------
drbd2: _drbd_set_state: Changing state.
drbd2: _drbd_set_state: setting conn to SyncSource or SyncTarget see message below
drbd2: peer( Secondary -> Unknown ) conn( Connected -> TearDown ) pdsk( UpToDate -> DUnknown )
drbd2: Writing meta data super block now.
drbd2: meta connection shut down by peer.
drbd2: asender terminated
drbd2: tl_clear()
drbd2: Connection closed
drbd2: _drbd_set_state: Changing state.
drbd2: _drbd_set_state: setting conn to SyncSource or SyncTarget see message below
drbd2: conn( TearDown -> Unconnected )
drbd2: receiver terminated
drbd2: receiver (re)started
drbd2: drbd_connect: ****************Connecting********
drbd2: _drbd_set_state: Changing state.
drbd2: _drbd_set_state: setting conn to SyncSource or SyncTarget see message below
drbd2: conn( Unconnected -> WFConnection )
drbd2: _drbd_set_state: Changing state.
drbd2: _drbd_set_state: setting conn to SyncSource or SyncTarget see message below
drbd2: conn( WFConnection -> WFReportParams )
drbd2: Handshake successful: DRBD Network Protocol version 86
drbd2: drbd_connect: ****************sending procol while Connecting********
drbd2: drbd_connect: ****************Sending paran while Connecting********
drbd2: drbd_connect: **************** Sending sizes while Connecting********
drbd2: drbd_connect: ****************sendin uuids while Connecting********
drbd2: drbd_connect: ****************Sending states while Connecting********
drbd2: drbd_connect: ****************DONE!!!! Connecting********
drbd2: drbd_sync_handshake:
drbd2: self B2E340FD0FB4F236:0000000000000000:E515D7C70644AF92:B2E340FD0FB4F219
drbd2: peer 0000000000000000:0000000000000000:B2E340FD0FB4F236:B2E340FD0FB4F219
drbd2: uuid_compare()=-2 by rule 6
drbd2: Writing meta data super block now.
drbd2: writing of bitmap took 4 jiffies
drbd2: 12 GB marked out-of-sync by on disk bit-map.
drbd2: 13336132 KB now marked out-of-sync by on disk bit-map.
drbd2: Writing meta data super block now.
drbd2: uuid[History_start] now B2E340FD0FB4F236
drbd2: uuid[Current] now 0000000000000000
drbd2: _drbd_set_state: Changing state.
drbd2: _drbd_set_state: setting conn to SyncSource or SyncTarget see message below
drbd2: peer( Unknown -> Secondary ) conn( WFReportParams -> WFBitMapT ) pdsk( DUnknown -> UpToDate )
drbd2: Writing meta data super block now.
More information about the drbd-dev
mailing list