[DRBD-user] "Concurrent local write detected!"
Chris Worley
worleys at gmail.com
Mon Dec 20 21:06:41 CET 2010
I've read in the archives that this is a severe error, even in a
primary/primary setup, but I have seen nothing on how to fix it. I see
these messages logged constantly whenever DRBD is in use, on both
primary systems (with or without GFS on top).
I'm using RHEL5.5/2.6.18-194.3.1.el5 and IB/SDP.
This seems to have eventually led to the following message spewing on
one primary:
block drbd0: [drbd0_worker/7157] sock_sendmsg time expired, ko = ...
...and later timeouts like (I'm using fio to benchmark):
INFO: task fio:9015 blocked for more than 120 seconds.
fio D ffffffff80150462 0 9015 8972 9016 (NOTLB)
ffff8109c9083d78 0000000000000086 0000000000000000 0000000000000000
0000000000000000 0000000000000007 ffff8108c286c080 ffff8106552750c0
00005f2aba62be3a 0000000000000365 ffff8108c286c268 0000000500000080
Call Trace:
[<ffffffff80063c6f>] __mutex_lock_slowpath+0x60/0x9b
[<ffffffff80063cb9>] .text.lock.mutex+0xf/0x14
[<ffffffff80063c06>] __mutex_unlock_slowpath+0x2a/0x33
[<ffffffff887d346a>] :gfs:__gfs_write+0x82/0xc6
[<ffffffff800eef32>] aio_pwrite+0x2c/0x75
[<ffffffff800ef9f3>] aio_run_iocb+0xef/0x18a
[<ffffffff800f055d>] io_submit_one+0x396/0x499
[<ffffffff800f0b74>] sys_io_submit+0xbe/0x1a4
[<ffffffff8005d116>] system_call+0x7e/0x83
...on the other primary, I see:
block drbd1: helper command: /sbin/drbdadm initial-split-brain minor-1 exit code 0 (0x0)
block drbd1: Split-Brain detected but unresolved, dropping connection!
block drbd1: helper command: /sbin/drbdadm split-brain minor-1
block drbd1: helper command: /sbin/drbdadm split-brain minor-1 exit code 0 (0x0)
block drbd1: conn( NetworkFailure -> Disconnecting )
block drbd1: error receiving ReportState, l: 4!
block drbd1: Connection closed
block drbd1: conn( Disconnecting -> StandAlone )
block drbd1: receiver terminated
block drbd1: Terminating receiver thread
... followed by more "Concurrent local write detected!" then the
"sock_sendmsg time expired, ko =" spewage on that primary too.
The IB link is fine. The split-brain resolution failure is due to the
fencing mechanism not working, but I'm not worried about that yet (it
should have never gotten to the state of detecting split brain): I'm
worried about resolving the "concurrent local write" issue, and
keeping GFS from hanging.
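To see which of these messages shows up first and how often, the kernel log from each node can be tallied per DRBD device. The following is only a minimal sketch: the function name, the labels, and the assumption that every relevant line carries the "block drbdN:" prefix are illustrative and not part of DRBD itself.

```python
import re
from collections import Counter

# Substrings copied from the messages quoted in this post.
# The dictionary keys are made-up labels for this sketch.
PATTERNS = {
    "concurrent_write": "Concurrent local write detected!",
    "sendmsg_timeout": "sock_sendmsg time expired",
    "split_brain": "Split-Brain detected",
}

# Assumption: relevant lines contain the "block drbdN:" prefix
# seen in the quoted log excerpts.
DEVICE_RE = re.compile(r"block (drbd\d+):")


def tally_drbd_messages(lines):
    """Count occurrences of each message type per DRBD device."""
    counts = Counter()
    for line in lines:
        match = DEVICE_RE.search(line)
        if not match:
            continue  # not a DRBD block-device message
        for label, needle in PATTERNS.items():
            if needle in line:
                counts[(match.group(1), label)] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "block drbd0: [drbd0_worker/7157] sock_sendmsg time expired, ko = ...",
        "block drbd1: Split-Brain detected but unresolved, dropping connection!",
        "block drbd1: Connection closed",
    ]
    for (device, label), n in sorted(tally_drbd_messages(sample).items()):
        print(device, label, n)
```

Feeding it the lines of a saved log from each primary gives a per-device count that can be compared between the two nodes.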
Any ideas?
Thanks,
Chris