Hi,

I'm using drbd-8.2.7 with "common { disk { on-io-error detach; } }" and still see "drbd3: Local READ failed..." messages even after the logs show the drbd3 disk state changed to Diskless. It seems drbd did not detach the local drbd3 disk. This drives the load average beyond 40 and stalls the file system stacked on drbd3 while it waits for I/O, which is unacceptable.

If this is not a bug, could there be an option to emulate the drbd-0.7 behavior and detach the local disk immediately on an I/O error?

Dec 22 10:25:38 node1 kernel: ata2: command 0x25 timeout, stat 0xd0 host_stat 0x1
Dec 22 10:25:38 node1 kernel: ata2: status=0xd0 { Busy }
Dec 22 10:25:38 node1 kernel: SCSI error : <1 0 0 0> return code = 0x8000002
Dec 22 10:25:38 node1 kernel: sdb: Current: sense key: Aborted Command
Dec 22 10:25:38 node1 kernel: Additional sense: Scsi parity error
Dec 22 10:25:38 node1 kernel: end_request: I/O error, dev sdb, sector 4057363
Dec 22 10:25:38 node1 kernel: drbd3: got an _req_mod() errno of -5
Dec 22 10:25:38 node1 kernel: drbd3: Local READ failed sec=1952848s size=4096
Dec 22 10:25:38 node1 kernel: drbd3: disk( UpToDate -> Failed )
Dec 22 10:25:38 node1 kernel: drbd3: Local IO failed. Detaching...
Dec 22 10:25:38 node1 kernel: ATA: abnormal status 0xD0 on port 0xE007
Dec 22 10:25:38 node1 last message repeated 2 times
Dec 22 10:25:38 node1 kernel: drbd3: disk( Failed -> Diskless )
Dec 22 10:25:38 node1 kernel: drbd3: Notified peer that my disk is broken.
...
Dec 22 10:33:07 node1 watchdog[68054]: loadavg 37 24 12 is higher than the given threshold 36 27 18!
Dec 22 10:33:07 node1 watchdog[68054]: shutting down the system because of error -3
Dec 22 10:33:08 node1 kernel: ata2: command 0x25 timeout, stat 0xd0 host_stat 0x1
Dec 22 10:33:08 node1 kernel: ata2: status=0xd0 { Busy }
Dec 22 10:33:08 node1 kernel: SCSI error : <1 0 0 0> return code = 0x8000002
Dec 22 10:33:08 node1 kernel: sdb: Current: sense key: Aborted Command
Dec 22 10:33:08 node1 kernel: Additional sense: Scsi parity error
Dec 22 10:33:08 node1 kernel: end_request: I/O error, dev sdb, sector 235310987
Dec 22 10:33:08 node1 kernel: drbd3: got an _req_mod() errno of -5
Dec 22 10:33:08 node1 kernel: drbd3: Local READ failed sec=233206472s size=4096
Dec 22 10:33:08 node1 kernel: ATA: abnormal status 0xD0 on port 0xE007
Dec 22 10:33:08 node1 last message repeated 2 times
...
Shutdown/reboot with sync took _very_ long; gets stuck waiting for drbd3!
...
Dec 22 11:13:10 node1 kernel: end_request: I/O error, dev sdb, sector 449904539
Dec 22 11:13:10 node1 kernel: drbd3: got an _req_mod() errno of -5
Dec 22 11:13:10 node1 kernel: drbd3: Local READ failed sec=447800024s size=4096
...
Dec 22 11:14:10 node1 kernel: end_request: I/O error, dev sdb, sector 180695108
Dec 22 11:14:10 node1 kernel: drbd3: got an _req_mod() errno of -5
Dec 22 11:14:10 node1 kernel: drbd3: Local WRITE failed sec=178590593s size=512
...
Dec 22 11:20:37 node1 syslogd 1.4.1: restart (remote reception).
Dec 22 11:20:37 node1 syslog: syslogd startup succeeded

---

common {
  protocol C;
  net {
    sndbuf-size 0;
    max-buffers 32768;
    unplug-watermark 2048;
    timeout 30;
    connect-int 5;
    ping-int 5;
    ko-count 3;
    after-sb-0pri discard-older-primary;
    after-sb-1pri consensus;
    after-sb-2pri violently-as0p;
    rr-conflict disconnect;
  }
  disk { on-io-error detach; }
  syncer { rate 16M; al-extents 1187; }
  startup { wfc-timeout 120; degr-wfc-timeout 120; }
  handlers {
    before-resync-target "exit 0";
    after-resync-target "exit 0";
  }
}

resource drbd3 {
  on node1 {
    device /dev/drbd3;
    disk /dev/sdb2;
    address ipv4 A.B.C.D:P;
    meta-disk internal;
  }
  on node3 {
    device /dev/drbd3;
    disk /dev/sdb2;
    address ipv4 A.B.C.E:P;
    meta-disk internal;
  }
  disk { size 311385340K; }
  syncer { rate 16M; after drbd2; }
}
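
In case it helps anyone check their own logs for the same symptom, here is a rough Python sketch that flags "Local READ/WRITE failed" messages logged for a device after that device has already reported "disk( ... -> Diskless )". The regular expressions are derived only from the kernel messages quoted above, and the function name failures_after_diskless is made up for this illustration; other drbd versions may word these messages differently, so treat it as a starting point rather than a reliable detector.

```python
import re

# Both patterns are modelled on the log lines quoted in this message.
# Assumption: every disk state change is logged as "disk( Old -> New )".
STATE_CHANGE = re.compile(r"kernel: (drbd\d+): disk\( (\w+) -> (\w+) \)")
LOCAL_FAIL = re.compile(r"kernel: (drbd\d+): Local (READ|WRITE) failed")


def failures_after_diskless(lines):
    """Return (device, line) pairs for local I/O failures that were
    logged while the device was already reported as Diskless."""
    diskless = set()  # devices whose last logged disk state is Diskless
    hits = []
    for line in lines:
        change = STATE_CHANGE.search(line)
        if change:
            device, old_state, new_state = change.groups()
            if new_state == "Diskless":
                diskless.add(device)
            elif old_state == "Diskless":
                # The device left Diskless again (e.g. it was re-attached).
                diskless.discard(device)
            continue
        failure = LOCAL_FAIL.search(line)
        if failure and failure.group(1) in diskless:
            hits.append((failure.group(1), line.rstrip("\n")))
    return hits
```

The function takes any iterable of log lines, so an open file object for the syslog file can be passed in directly. On the excerpt above it would report the READ failure at 10:33:08 and the later READ and WRITE failures, but not the first READ failure at 10:25:38, which precedes the Diskless transition.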