[Drbd-dev] DRBD-8 crash in drbd_rs_begin_io if connection is
terminated at the wrong time...
Graham, Simon
Simon.Graham at stratus.com
Fri Aug 11 20:17:39 CEST 2006
I haven't been able to figure this one out yet, but when the SyncSource
machine drops the network connection at the wrong time, I get the
attached panic.
What I can't work out at the moment is exactly which source line
corresponds to the faulting address. The disassembly is:
15647: 8b 4d 08 mov 0x8(%ebp),%ecx
1564a: 31 ff xor %edi,%edi
1564c: 8b 81 f0 02 00 00 mov 0x2f0(%ecx),%eax
15652: 8b 50 1c mov 0x1c(%eax),%edx
The fault occurs on the last instruction, and eax is zero, so this is a
double dereference: the first pointer, at offset 0x2f0 in the struct
pointed to by ecx, is NULL. I'm assuming that something got zeroed as a
result of losing the connection and that drbd_rs_begin_io is not
checking for that. (The disassembly from objdump -S does not show any
source lines around this spot, so it's pretty hard to identify the
actual line of code.)
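To make the access pattern concrete, here is a minimal sketch of the two loads in C. The struct and field names are hypothetical stand-ins (the real DRBD structures involved are not identified above); only the offsets 0x2f0 and 0x1c come from the disassembly. The guarded variant only illustrates the kind of check that appears to be missing; in the driver itself a bare NULL test would presumably also need locking or a reference to be safe against a concurrent disconnect.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins: only the offsets are taken from the disassembly. */
struct inner {
    char pad[0x1c];
    uint32_t value;        /* read by: mov 0x1c(%eax),%edx */
};

struct outer {
    char pad[0x2f0];
    struct inner *ptr;     /* read by: mov 0x2f0(%ecx),%eax */
};

static_assert(offsetof(struct outer, ptr) == 0x2f0, "first load offset");
static_assert(offsetof(struct inner, value) == 0x1c, "second load offset");

/* Unguarded form: if o->ptr is NULL, the second load touches address
 * 0 + 0x1c, which matches the faulting address 0000001c in the oops. */
uint32_t read_unguarded(const struct outer *o)
{
    return o->ptr->value;
}

/* Guarded form: load the pointer once, test it, then dereference.
 * Returns 0 and stores the value in *out, or -1 if the pointer is NULL. */
int read_guarded(const struct outer *o, uint32_t *out)
{
    const struct inner *p = o->ptr;

    if (p == NULL)
        return -1;
    *out = p->value;
    return 0;
}
```

With a zeroed struct outer, read_guarded() returns -1 instead of faulting, while read_unguarded() would fault in the same way as the trace.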
I'm hoping someone who knows this code better (i.e. Philipp or Lars ;-)
will immediately know what the problem is!
Simon
--------panic info ----------
drbd15: short read expecting header on sock: r=-512
drbd15: tl_clear()
Aug 11 09:40:45 ellwood kernel: drbd15: peer( Secondary -> Unknown ) conn( SyncSource -> StandAlone )
drbd0: short read expecting header on sock: r=-512
Unable to handle kernel NULL pointer dereference at virtual address 0000001c
printing eip:
f128d652
*pde = ma 00000000 pa fffff000
Oops: 0000 [#1]
Modules linked in: drbd ipmi_devintf ipmi_si ipmi_msghandler video
thermal processor fan button battery ac
CPU: 0
EIP: 0061:[<f128d652>] Not tainted VLI
EFLAGS: 00010246 (2.6.16.13-xen0 #2)
EIP is at drbd_rs_begin_io+0x2f2/0x5b0 [drbd]
eax: 00000000 ebx: 00000000 ecx: e6588000 edx: fbfa9000
esi: e658836c edi: 00000000 ebp: e6325f40 esp: e6325ecc
ds: 007b es: 007b ss: 0069
Process drbd0_worker (pid: 13898, threadinfo=e6324000 task=e5811030)
Stack: <0>00004100 e6325ef8 f129154b e6588000 e7b3dc80 00000009 e6325f14 00000020
       00000000 ffffffff 00000000 e5811030 c012d570 e6325f20 e6325f20 eaf32440
       e6588000 e6325f40 00000000 e5811030 c012d570 e658836c e658836c ffffffff
Call Trace:
[<c010513a>] show_stack_log_lvl+0xaa/0xe0
[<c010534e>] show_registers+0x18e/0x210
[<c0105549>] die+0xd9/0x180
[<c0112b7c>] do_page_fault+0x3cc/0x640
[<c0104d5f>] error_code+0x2b/0x30
[<f127fc7e>] w_make_resync_request+0xfe/0x310 [drbd]
[<f1281b16>] drbd_worker+0x186/0x4f7 [drbd]
[<f1290f8d>] drbd_thread_setup+0x7d/0xe0 [drbd]
[<c0102d85>] kernel_thread_helper+0x5/0x10
Code: 6c 03 00 00 89 45 d4 89 4d d8 c7 45 dc 70 d5 12 c0 b9 01 00 00 00
8d 55 d4 89 f0 e8 b9 fd e9 ce 8b 4d 08 31 ff 8b 81 f0 02 00 00 <8b> 50
1c 8b 81 f4 02 00 00 31 c9 83 ea 03 39 d0 0f 86 8f 00 00
<0>Fatal exception: panic in 5 seconds
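The addresses in the oops can be cross-checked against the objdump listing with a little arithmetic. The sketch below is only an illustration; the helper names are made up. It uses the EIP f128d652, the symbol offset +0x2f2 from the "EIP is at" line, and the objdump address 15652 of the faulting instruction.

```c
#include <stdint.h>

/* Start of the function, given an address inside it and the +offset
 * printed after the symbol name in the oops. */
uint32_t func_start(uint32_t addr, uint32_t offset)
{
    return addr - offset;
}

/* Load address of the module text, given the runtime EIP and the
 * address of the same instruction in the objdump listing. */
uint32_t load_base(uint32_t runtime_eip, uint32_t objdump_addr)
{
    return runtime_eip - objdump_addr;
}
```

With these numbers, func_start(0x15652, 0x2f2) gives 0x15360 as the start of drbd_rs_begin_io in the listing, and load_base(0xf128d652, 0x15652) gives 0xf1278000. The difference is page aligned, which is consistent with 15652 in the listing being the faulting instruction. The bytes around the <8b> marker in the Code: line (8b 4d 08 31 ff 8b 81 f0 02 00 00 <8b> 50 1c) match the four disassembled instructions as well.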