<html>
<head>
<style>
.hmmessage P
{
margin:0px;
padding:0px
}
body.hmmessage
{
font-size: 10pt;
font-family:Verdana
}
</style>
</head>
<body class='hmmessage'>
Hi,<br><br>I'm using drbd-8.2.7 with "common { disk { on-io-error detach; } }" and still see "drbd3: Local READ failed..." messages even after the logs show the drbd3 disk state changing to Diskless. It seems drbd did not actually detach the local drbd3 disk.<br><br>This causes the load average to climb beyond 40 and the file system stacked on drbd3 to stall waiting for I/O, which is unacceptable.<br><br>If this is not a bug, could there be an option to emulate the drbd-0.7 behavior and detach the local disk immediately on an I/O error?<br><br>Dec 22 10:25:38 node1 kernel: ata2: command 0x25 timeout, stat 0xd0 host_stat 0x1<br>Dec 22 10:25:38 node1 kernel: ata2: status=0xd0 { Busy }<br>Dec 22 10:25:38 node1 kernel: SCSI error : &lt;1 0 0 0&gt; return code = 0x8000002<br>Dec 22 10:25:38 node1 kernel: sdb: Current: sense key: Aborted Command<br>Dec 22 10:25:38 node1 kernel: Additional sense: Scsi parity error<br>Dec 22 10:25:38 node1 kernel: end_request: I/O error, dev sdb, sector 4057363<br>Dec 22 10:25:38 node1 kernel: drbd3: got an _req_mod() errno of -5<br>Dec 22 10:25:38 node1 kernel: drbd3: Local READ failed sec=1952848s size=4096<br>Dec 22 10:25:38 node1 kernel: drbd3: disk( UpToDate -> Failed ) <br>Dec 22 10:25:38 node1 kernel: drbd3: Local IO failed. 
Detaching...<br>Dec 22 10:25:38 node1 kernel: ATA: abnormal status 0xD0 on port 0xE007<br>Dec 22 10:25:38 node1 last message repeated 2 times<br>Dec 22 10:25:38 node1 kernel: drbd3: disk( Failed -> Diskless ) <br>Dec 22 10:25:38 node1 kernel: drbd3: Notified peer that my disk is broken.<br>...<br>Dec 22 10:33:07 node1 watchdog[68054]: loadavg 37 24 12 is higher than the given threshold 36 27 18!<br>Dec 22 10:33:07 node1 watchdog[68054]: shutting down the system because of error -3<br>Dec 22 10:33:08 node1 kernel: ata2: command 0x25 timeout, stat 0xd0 host_stat 0x1<br>Dec 22 10:33:08 node1 kernel: ata2: status=0xd0 { Busy }<br>Dec 22 10:33:08 node1 kernel: SCSI error : &lt;1 0 0 0&gt; return code = 0x8000002<br>Dec 22 10:33:08 node1 kernel: sdb: Current: sense key: Aborted Command<br>Dec 22 10:33:08 node1 kernel: Additional sense: Scsi parity error<br>Dec 22 10:33:08 node1 kernel: end_request: I/O error, dev sdb, sector 235310987<br>Dec 22 10:33:08 node1 kernel: drbd3: got an _req_mod() errno of -5<br>Dec 22 10:33:08 node1 kernel: drbd3: Local READ failed sec=233206472s size=4096<br>Dec 22 10:33:08 node1 kernel: ATA: abnormal status 0xD0 on port 0xE007<br>Dec 22 10:33:08 node1 last message repeated 2 times<br>...<br>Shutdown/reboot with sync took _very_ long; it got stuck waiting for drbd3!<br>...<br>Dec 22 11:13:10 node1 kernel: end_request: I/O error, dev sdb, sector 449904539<br>Dec 22 11:13:10 node1 kernel: drbd3: got an _req_mod() errno of -5<br>Dec 22 11:13:10 node1 kernel: drbd3: Local READ failed sec=447800024s size=4096<br>...<br>Dec 22 11:14:10 node1 kernel: end_request: I/O error, dev sdb, sector 180695108<br>Dec 22 11:14:10 node1 kernel: drbd3: got an _req_mod() errno of -5<br>Dec 22 11:14:10 node1 kernel: drbd3: Local WRITE failed sec=178590593s size=512<br>...<br>Dec 22 11:20:37 node1 syslogd 1.4.1: restart (remote reception).<br>Dec 22 11:20:37 node1 syslog: syslogd startup succeeded<br><br>---<br><br>common {<br> protocol C;<br> net {<br> sndbuf-size 0;<br> 
 max-buffers 32768;<br> unplug-watermark 2048;<br> timeout 30;<br> connect-int 5;<br> ping-int 5;<br> ko-count 3;<br> after-sb-0pri discard-older-primary;<br> after-sb-1pri consensus;<br> after-sb-2pri violently-as0p;<br> rr-conflict disconnect;<br> }<br> disk {<br> on-io-error detach;<br> }<br> syncer {<br> rate 16M;<br> al-extents 1187;<br> }<br> startup {<br> wfc-timeout 120;<br> degr-wfc-timeout 120;<br> }<br> handlers {<br> before-resync-target "exit 0";<br> after-resync-target "exit 0";<br> }<br>}<br><br>resource drbd3 {<br> on node1 {<br> device /dev/drbd3;<br> disk /dev/sdb2;<br> address ipv4 A.B.C.D:P;<br> meta-disk internal;<br> }<br> on node3 {<br> device /dev/drbd3;<br> disk /dev/sdb2;<br> address ipv4 A.B.C.E:P;<br> meta-disk internal;<br> }<br> disk {<br> size 311385340K;<br> }<br> syncer {<br> rate 16M;<br> after drbd2;<br> }<br>}<br></body>
</html>