[Drbd-dev] DRBD-8: another crash following disk write failures
Graham, Simon
Simon.Graham at stratus.com
Thu Jan 25 22:39:44 CET 2007
I've just run into another crash when we get disk write failures. This is related to the other recent problems where we could try to use the activity log after the disk is detached, but in this case it's an actual crash rather than a BUG() call:
Unable to handle kernel NULL pointer dereference at virtual address 000000ac
EIP is at w_io_error+0x18/0xa0 [drbd]
Call Trace:
[<c0105431>] show_stack_log_lvl+0xa1/0xe0
[<c0105621>] show_registers+0x181/0x200
[<c0105840>] die+0x100/0x1a0
[<c0115746>] do_page_fault+0x3c6/0x8b1
[<c0105097>] error_code+0x2b/0x30
[<ee3b4b0e>] drbd_worker+0x2de/0x4b5 [drbd]
[<ee3c6eec>] drbd_thread_setup+0x8c/0x100 [drbd]
[<c0102ec5>] kernel_thread_helper+0x5/0x10
And the code in question is this:
int w_io_error(drbd_dev* mdev, struct drbd_work* w, int cancel)
{
	drbd_request_t *req = (drbd_request_t*)w;
	int ok;

	/* FIXME send a "set_out_of_sync" packet to the peer
	 * in the PassOn case...
	 * in the Detach (or Panic) case, we (try to) send
	 * a "we are diskless" param packet anyways, and the peer
	 * will then set the FullSync bit in the meta data ...
	 */
	D_ASSERT(mdev->bc->dc.on_io_error != PassOn);
Oops - mdev->bc can be NULL by the time we get here...
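For anyone wondering how this ties in with the fault address: a small address such as 000000ac is what you would expect when a member is read through a NULL struct pointer, because the address touched is simply the member's offset. A rough sketch of that arithmetic follows. The struct names and the 0xac bytes of padding are hypothetical, chosen only to reproduce the number from the oops; they are not the real DRBD layout.

```c
#include <stddef.h>

/* Hypothetical stand-ins; the real DRBD structures are laid out differently. */
struct disk_conf_layout {
	int on_io_error;
};

struct backing_dev_layout {
	char other_members[0xac];	/* placeholder for whatever precedes dc */
	struct disk_conf_layout dc;
};

/*
 * Address that would be touched when evaluating bc->dc.on_io_error
 * with bc == NULL: the offset of the member from the start of *bc.
 */
static size_t fault_address_for_null_bc(void)
{
	return offsetof(struct backing_dev_layout, dc)
	     + offsetof(struct disk_conf_layout, on_io_error);
}
```

With this made-up layout the function returns 0xac, matching the address in the oops.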
I propose simply commenting out the assert for now (patch attached). I left the code in place, commented out, because of the 'FIXME' comment above it, which I didn't want to lose.
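An alternative to commenting the assert out might be to keep the check but skip it when the backing device is already gone. The following is only a sketch of that idea, not what the attached patch does. The types are hypothetical stand-ins carrying just the members the assert touches, and the helper name io_error_policy_ok is invented for illustration. It also ignores any race with a concurrent detach, which real code would have to handle with whatever locking or reference counting protects mdev->bc.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the DRBD types; only the members used here. */
enum io_error_policy_sketch { PassOn, Panic, Detach };

struct disk_conf_sketch { enum io_error_policy_sketch on_io_error; };
struct backing_dev_sketch { struct disk_conf_sketch dc; };
struct dev_sketch { struct backing_dev_sketch *bc; /* NULL once detached */ };

/*
 * Returns 1 if the "policy is not PassOn" expectation holds, or if it
 * cannot be checked because the disk is already detached; 0 if violated.
 */
static int io_error_policy_ok(const struct dev_sketch *mdev)
{
	if (mdev->bc == NULL)
		return 1;	/* nothing to check, and nothing to dereference */
	return mdev->bc->dc.on_io_error != PassOn;
}
```

In w_io_error the assert would then take the result of such a helper instead of dereferencing mdev->bc directly.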
Simon
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ava-1617.patch
Type: application/octet-stream
Size: 626 bytes
Desc: ava-1617.patch
Url : http://lists.linbit.com/pipermail/drbd-dev/attachments/20070125/09ab6873/ava-1617.obj