About a month ago I wrote about issues I am having with DRBD 8.3.7 on CentOS with kernel 2.6.18-164.15.1.el5.

--
May 2 06:02:08 ragoon6 kernel: block drbd0: p read: error=-5
May 2 06:02:08 ragoon6 kernel: block drbd0: Local READ failed sec=4211072s size=4096
May 2 06:02:08 ragoon6 kernel: block drbd0: disk( UpToDate -> Failed )
May 2 06:02:08 ragoon6 kernel: block drbd0: Local IO failed in __req_mod.Detaching...
May 2 06:02:08 ragoon6 kernel: block drbd0: disk( Failed -> Diskless )
May 2 06:02:08 ragoon6 kernel: block drbd0: Notified peer that my disk is broken.
May 2 06:02:09 ragoon6 kernel: block drbd0: 954 messages suppressed in /usr/src/redhat/BUILD/drbd-8.3.7/drbd/drbd_req.c:131.
May 2 06:02:09 ragoon6 kernel: block drbd0: Should have called drbd_al_complete_io(, 138163712), but my Disk seems to have failed :(
May 2 06:02:09 ragoon6 kernel: block drbd0: Should have called drbd_al_complete_io(, 138163720), but my Disk seems to have failed :(
--
--
May 2 06:37:41 ragoon6 kernel: block drbd1: p read: error=-5
May 2 06:37:41 ragoon6 kernel: block drbd1: Local READ failed sec=37749432s size=4096
May 2 06:37:41 ragoon6 kernel: block drbd1: disk( UpToDate -> Failed )
May 2 06:37:41 ragoon6 kernel: block drbd1: Local IO failed in __req_mod.Detaching...
May 2 06:37:41 ragoon6 kernel: block drbd1: disk( Failed -> Diskless )
May 2 06:37:41 ragoon6 kernel: block drbd1: Notified peer that my disk is broken.
May 2 06:37:42 ragoon6 kernel: block drbd1: Should have called drbd_al_complete_io(, 11421405240), but my Disk seems to have failed :(
May 2 06:37:42 ragoon6 kernel: block drbd1: Should have called drbd_al_complete_io(, 11421405248), but my Disk seems to have failed :(
May 2 06:37:42 ragoon6 kernel: block drbd1: Should have called drbd_al_complete_io(, 11421405256), but my Disk seems to have failed :(
--

I have yet to receive any replies; however, here is some more information. As I suspected, when I removed DRBD from my stack, the error went away.
I repeat, there is no IO error when DRBD is not in use. I have been running like this for nearly two weeks. The EIO error occurs only with DRBD; it is apparently not pushed up from an underlying block device. With DRBD in place, I get an EIO error and go diskless within 4 hours.

Needless to say, I need this replication in place, and I don't really know what steps to take to track down this problem with DRBD. I have used DRBD for many years and have not encountered a problem like this one. I would be grateful if someone could suggest some troubleshooting steps, or another replication solution I could try.

Thanks.
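
One possible troubleshooting step is to pull the failing sector numbers out of the "Local READ failed" lines above, so that the same regions can be re-read directly from the backing device while DRBD is out of the stack. The following is only a rough Python sketch, not anything DRBD ships. It assumes the sec= value counts 512-byte sectors, and the names FAILED_IO, SECTOR_BYTES and failed_requests are made up for illustration. Matching WRITE as well as READ is also an assumption, since only READ failures appear in my logs.

```python
import re

# Matches lines such as:
#   block drbd0: Local READ failed sec=4211072s size=4096
# WRITE is accepted too, as an assumption; only READ appears in the logs above.
FAILED_IO = re.compile(
    r"block (?P<dev>drbd\d+): Local (?P<op>READ|WRITE) failed "
    r"sec=(?P<sec>\d+)s size=(?P<size>\d+)"
)

SECTOR_BYTES = 512  # assumption: sec= is expressed in 512-byte sectors


def failed_requests(log_text):
    """Return one dict per 'Local READ/WRITE failed' line found in log_text."""
    results = []
    for m in FAILED_IO.finditer(log_text):
        sec = int(m.group("sec"))
        size = int(m.group("size"))
        results.append({
            "dev": m.group("dev"),
            "op": m.group("op"),
            "sector": sec,
            "size": size,
            "byte_offset": sec * SECTOR_BYTES,
            "sector_count": size // SECTOR_BYTES,
        })
    return results


# Two lines copied from the log excerpts above.
SAMPLE = """\
May 2 06:02:08 ragoon6 kernel: block drbd0: Local READ failed sec=4211072s size=4096
May 2 06:37:41 ragoon6 kernel: block drbd1: Local READ failed sec=37749432s size=4096
"""

if __name__ == "__main__":
    for r in failed_requests(SAMPLE):
        print(r["dev"], r["op"], "sector", r["sector"],
              "byte offset", r["byte_offset"],
              "sectors", r["sector_count"])
```

The sector and sector count could then be fed to something like dd (bs=512 with skip= and count=) against the lower-level device. Whether the offsets line up one-to-one with the backing device depends on the meta-data layout, so treat the numbers as a starting point only.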