On Mon, Apr 13, 2015 at 05:45:25PM -0600, Fang Sun wrote:
> Hi drbd developers,
>
> After some research and tests I feel I found the reason of this problem and
> a possible fix in drbd.
> Would you please check if my theory is correct?
>
>
> Let me use 8.4.6 as the code base when I explain it.
> When conn_disconnect hang it is hanging at line drbd_receiver.c:5178
> static int drbd_disconnected(struct drbd_peer_device *peer_device)
> {
>         ........
>         wait_event(device->misc_wait, !test_bit(BITMAP_IO, &device->flags));
> }
>
> The reason is device has flag BITMAP_IO set.
>
>
> The reason why flag BITMAP_IO is set and not clear is:
> Disk state changes when network is disconnected and after_state_ch is
> called.
>
> At drbd_state.c line 1949 drbd_queue_bitmap_io is called in after_state_ch().
>
> I think the real reason is in drbd_queue_bitmap_io. drbd_main.c line 3641.
> void drbd_queue_bitmap_io(struct drbd_device *device,
>                           int (*io_fn)(struct drbd_device *),
>                           void (*done)(struct drbd_device *, int),
>                           char *why, enum bm_flag flags)
> {
>         .........
>         set_bit(BITMAP_IO, &device->flags);
>         if (atomic_read(&device->ap_bio_cnt) == 0) {
>                 if (!test_and_set_bit(BITMAP_IO_QUEUED, &device->flags))
>                         drbd_queue_work(&first_peer_device(device)->connection->sender_work,
>                                         &device->bm_io_work.w);
>         }
>         ........
> }
>
> In the code the only code to clear BITMAP_IO is in
> device->bm_io_work.w(w_bitmap_io). But when
> atomic_read(&device->ap_bio_cnt) != 0 the flag BITMAP_IO is set, however
> bm_io_work.w is not called.
> Then drbd_disconnected() is blocked.
>
> Should we move set_bit(BITMAP_IO, &device->flags) to the front of
> drbd_queue_work()?

No.
That would be the wrong fix, and cause potential inconsistencies later.

It may need to be fixed, but in a different way.
Let me (reproduce locally ... and) think about that for a bit.

Thanks,

-- 
: Lars Ellenberg
: http://www.LINBIT.com | Your Way to High Availability
: DRBD, Linux-HA and Pacemaker support and consulting

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
__
please don't Cc me, but send to list -- I'm subscribed
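
The sequence Fang Sun describes can be followed in a small userspace model.
This is only a sketch, not DRBD code: every model_* name, the struct, and
the bit numbers are made up for illustration, and the plain integer
operations stand in for the kernel's atomic bitops. It also assumes, as the
quoted mail states, that the queued work item is the only place that clears
BITMAP_IO; any other path that might queue the work later is not modelled.

```c
/* Userspace model only; none of this is DRBD code. */
#include <assert.h>
#include <stdbool.h>

/* Bit numbers are arbitrary in this model. */
enum { BITMAP_IO = 0, BITMAP_IO_QUEUED = 1, MODEL_NR_BITS = 2 };

struct model_device {
	unsigned long flags;  /* stands in for device->flags */
	int ap_bio_cnt;       /* stands in for atomic device->ap_bio_cnt */
	int queued_work;      /* number of pending bitmap work items */
};

static bool model_test_bit(int nr, const unsigned long *flags)
{
	assert(nr >= 0 && nr < MODEL_NR_BITS);
	return (*flags >> nr) & 1UL;
}

static void model_set_bit(int nr, unsigned long *flags)
{
	assert(nr >= 0 && nr < MODEL_NR_BITS);
	*flags |= 1UL << nr;
}

static void model_clear_bit(int nr, unsigned long *flags)
{
	assert(nr >= 0 && nr < MODEL_NR_BITS);
	*flags &= ~(1UL << nr);
}

static bool model_test_and_set_bit(int nr, unsigned long *flags)
{
	bool old = model_test_bit(nr, flags);

	model_set_bit(nr, flags);
	return old;
}

/* Mirrors the quoted excerpt of drbd_queue_bitmap_io(). */
static void model_queue_bitmap_io(struct model_device *d)
{
	model_set_bit(BITMAP_IO, &d->flags);
	if (d->ap_bio_cnt == 0) {
		if (!model_test_and_set_bit(BITMAP_IO_QUEUED, &d->flags))
			d->queued_work++;
	}
}

/*
 * Stands in for the worker running w_bitmap_io().  Clearing
 * BITMAP_IO_QUEUED here is an assumption of the model.
 */
static void model_run_queued_work(struct model_device *d)
{
	if (d->queued_work > 0) {
		d->queued_work--;
		model_clear_bit(BITMAP_IO_QUEUED, &d->flags);
		model_clear_bit(BITMAP_IO, &d->flags);
	}
}

/* The condition the wait_event() in drbd_disconnected() waits for. */
static bool model_disconnect_may_proceed(const struct model_device *d)
{
	return !model_test_bit(BITMAP_IO, &d->flags);
}

/* True if the disconnect path would still be waiting afterwards. */
static bool model_stuck_after_queue(int ap_bio_cnt)
{
	struct model_device d = { .flags = 0, .ap_bio_cnt = ap_bio_cnt,
				  .queued_work = 0 };

	model_queue_bitmap_io(&d);
	model_run_queued_work(&d);
	return !model_disconnect_may_proceed(&d);
}
```

Calling model_stuck_after_queue(0) models the case where no application
bios are in flight: the work is queued, runs, and clears the flag. Calling
it with a non-zero count models the reported case: BITMAP_IO is set but
nothing is queued, so within this model the wait condition never becomes
true. The model says nothing about what the right fix is.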