[DRBD-user] Re: drbd error -5 and lvm thoughts and observations
Maurice Volaski
mvolaski at aecom.yu.edu
Wed Apr 11 07:18:40 CEST 2007
>So if I use the panic option the kernel would therefore crash forcing a
>failover to occur?
Would you really want that? DRBD is functioning as RAID 1: when one
"disk" fails, the other "disk" should take over, and that's what DRBD
is doing. The users can continue working as if nothing had happened.
Then, during an off hour, you can manually force a failover.
>Is there no other way for heartbeat to monitor the status of the drbd
>devices?
I'm not sure it's heartbeat's responsibility. You could roll your own
logging. For my system, I actually check /proc/drbd for "Diskless"
periodically.
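
A minimal sketch of such a periodic check, in Python. It is an
illustration only, not the actual script referred to above. The names
diskless_lines and check_once are hypothetical, and the only thing
assumed about /proc/drbd is that the word "Diskless" shows up on the
line of an affected device.

```python
from pathlib import Path

# Status file exported by the DRBD kernel module.
PROC_DRBD = Path("/proc/drbd")


def diskless_lines(status_text):
    """Return the lines of a /proc/drbd dump that mention "Diskless".

    This is a plain substring match; it does not parse the status fields.
    """
    return [
        line.strip()
        for line in status_text.splitlines()
        if "Diskless" in line
    ]


def check_once(path=PROC_DRBD):
    """Read the status file once and return any "Diskless" lines.

    Returns an empty list if the file cannot be read (for example when
    the DRBD module is not loaded); a real monitor may prefer to treat
    that case as an error of its own.
    """
    try:
        text = path.read_text()
    except OSError:
        return []
    return diskless_lines(text)
```

Calling check_once() from cron or from a simple sleep loop, and mailing
or logging any non-empty result, would give the "periodic" part. How
to alert is left open here.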
>Basically I ended up with a cluster hang.
>
I'm not 100% certain how that came about. You tried to kill drbd
while the primary was still using it. What I suggest is to wait for
an off hour and then manually stop heartbeat on the primary. That
should cause the heartbeat on the other system to take over cleanly.
Regardless, I think that your real problem is the misbehaving 3ware
card. RAID cards should *never* send SCSI errors up the I/O stack
unless multiple disks have failed simultaneously.
--
Maurice Volaski, mvolaski at aecom.yu.edu
Computing Support, Rose F. Kennedy Center
Albert Einstein College of Medicine of Yeshiva University