[DRBD-user] hardware failure on one peer can bring down drbd 0.7.18 on the other

Lars Ellenberg Lars.Ellenberg at linbit.com
Tue May 16 11:34:16 CEST 2006


/ 2006-05-16 02:59:04 -0400
\ Maurice Volaski:
> I have (once again) a funky problem with one machine where the underlying hardware RAID device appears to stop responding. The /dev/sdax 
> devices are directly mapped to drbd devices and the configuration is
> 
> disk    { on-io-error detach; }
> 
> Since I'm using protocol C, it appears that drbd won't start I/O on the primary until the secondary has completed. So a freeze on one 
> machine ends up freezing both of them.
> 
> But it appears that the problematic computer, which is secondary, has failed severely enough to prevent this detachment but still allows drbd's 
> "heartbeat" through, and it is actually getting that heartbeat.
> 
> That is, I should be getting a ServerForDLess, but I'm not. I think I've come across this situation before, so I don't think it's 
> specific to version 0.7.18.

if lower levels don't report an error back, but simply never report
completion _at all_, we cannot do much about it.

> The result of this is essentially the equivalent of a freeze on the primary computer.

read up about ko-count
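
a minimal sketch of where that option would go, assuming the 0.7
drbd.conf syntax. the resource name r0 and the numbers are only
illustrative, not a recommendation, and the per-host "on" sections
are omitted:

    resource r0 {
      protocol C;
      net {
        timeout   60;   # in tenths of a second, so 6 seconds
        ko-count  4;    # 0 (the default) disables the feature
      }
      disk {
        on-io-error detach;
      }
    }

with something like this, if the secondary fails to complete a single
write request for ko-count times the timeout, the primary should drop
the connection to it and carry on alone instead of blocking
indefinitely.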

-- 
: Lars Ellenberg                                  Tel +43-1-8178292-0  :
: LINBIT Information Technologies GmbH            Fax +43-1-8178292-82 :
: Schoenbrunner Str. 244, A-1120 Vienna/Europe   http://www.linbit.com :
__
please use the "List-Reply" function of your email client.