[DRBD-user] can drbd be made to detect that it has failed to write to the underlying device in a 'long time'?

Tue Apr 13 09:38:46 CEST 2004

/ 2004-04-12 16:51:25 -0500
\ Todd Denniston:
> can drbd be made to detect that it has failed to write to the underlying
> device in a 'long time'?
> I am experiencing a problem where the external raid box I have {Promise
> RM8000} stops responding on the scsi bus and the card {adaptec} is unable to
> reset the Promise box.  
> I was wondering if in this situation where drbd has been unable to actually
> get ANY data synced to the disk on the secondary node (and because the
> secondary node can't sync any data to disk in proto C, the primary is stuck
> too) for about 10 minutes, drbd could be made to consider this a lower level
> failure and do the drbd-panic in 0.6.10 (or other options I believe are
> available in 0.7.x)?

did you look at DRBD's ko-count option?

in your scenario, you should see plenty of 'ko count down' messages in
the syslog. just set ko-count to some value > 0; when the countdown
hits 0, the Primary goes into StandAlone, because it figures that its
peer has severe IO problems and is unlikely to recover soonish.
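
as a rough sketch, assuming the 0.7.x drbd.conf syntax (0.6.x config
files are written differently, so check the drbd.conf man page for your
version), the relevant net section could look like this. the resource
name r0 and the numbers are only illustrative, not recommendations:

```
resource r0 {
  protocol C;
  net {
    timeout  60;   # in units of 0.1 seconds, i.e. 6 seconds
    ko-count 4;    # example value; 0 leaves the feature disabled
  }
  # disk and "on <host>" sections omitted
}
```

with these example values, and if I read the man page correctly, the
peer would be given up on after roughly ko-count * timeout = 24 seconds
without completing a write request.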

once the "operator" has sorted out the problems, do a "drbd reconnect"
on the StandAlone Primary...

	Lars Ellenberg