[DRBD-user] can drbd be made to detect that it has failed to write to the underlying device in a 'long time'?

Lars Ellenberg Lars.Ellenberg at linbit.com
Tue Apr 13 17:00:31 CEST 2004


/ 2004-04-13 07:15:56 -0500
\ Todd Denniston:
> all of my drbd device net sections contain (and did at the time of the lockup
> too):
>     sync-nice  = -1  
>     sync-min    = 1M
>     sync-max    = 20M   # maximal average syncer bandwidth
>     tl-size     = 5000  # transfer log size, ensures strict write ordering
>     timeout     = 60    # unit: 0.1 seconds
>     connect-int = 10    # unit: seconds
>     ping-int    = 10    # unit: seconds
>     ko-count    = 10    # if some block send times out this many times,
>     sync-group  = 0 #Note this changes with each drbd device so they don't
> thrash the heads
> 
> Which I thought meant that in ~60 seconds[1] I would get a fallover.

Ah. No.
This only detects whether I was able to send something to the Secondary.
If not (for that ~60 seconds), I disconnect and ignore my peer.

This does NOT detect local IO failure; that case is normally covered by
the "do-panic" option: when I get a local IO failure, I will just panic
the box, which should always trigger a failover ...

> What I was suspecting is the 2 drbd's can still talk to one another, and the
> failing node's drbd is blocking (but not failing) on the write to the SCSI
> layer, because the scsi layer is in a loop retrying the reset on the Promise
> box ''hard drive''.
> 
> please clue-by-four me if I am still missing the point here. :)
>
> [1]the way I processed the information:
>  timeout * ko-count = 6.0 seconds * 10 = 60 seconds

[1]: Yes. Plus some offset for the first ping to be sent,
     but no more than ping-int.
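
To spell out the arithmetic (a back-of-the-envelope sketch only, using
the values you quoted; the exact offset depends on when the first ping
happens to fire):

# sketch: worst-case time until the Primary gives up on a dead peer,
# with the quoted settings. "timeout" is in 0.1 second units, the
# others in seconds. this is just arithmetic on the config values,
# not a reimplementation of drbd's internal counting.
timeout_ds = 60        # 0.1 s units -> 6.0 s per failed block send
ko_count   = 10        # give up after this many consecutive timeouts
ping_int_s = 10        # seconds between pings

per_timeout_s = timeout_ds / 10.0
print("one timeout      : %4.1f s" % per_timeout_s)
print("ko-count expired : %4.1f s" % (ko_count * per_timeout_s))
print("plus first ping  : %4.1f s worst case" % (ko_count * per_timeout_s + ping_int_s))

So with your settings, a *dead* peer should be dropped after at most
about 70 seconds.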

Hm.
If the Secondary manages to get a "throughput" of more than one block
per (ko-count * ping-interval), we do not disconnect. If it "even"
manages to get >= one block (4k) per ping-interval, ko-count won't
trigger at all.

So a "very slow" but "not slow enough" write throughput on the Secondary
will throttle the Primary to the same slowness.
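
To put numbers on that (again only a sketch, assuming 4k blocks and
your ping-int / ko-count):

# sketch: how slow the Secondary may be and still keep the Primary
# connected -- and therefore throttled. assumes 4 KiB blocks.
block_bytes = 4096
ping_int_s  = 10
ko_count    = 10

# below this rate ko-count eventually expires and we disconnect:
disconnect_below = block_bytes / float(ko_count * ping_int_s)   # ~41 bytes/s
# at or above this rate ko-count never counts down at all:
never_fires_from = block_bytes / float(ping_int_s)              # ~410 bytes/s

print("disconnect only below : %.0f bytes/s" % disconnect_below)
print("ko-count never fires  : %.0f bytes/s and up" % never_fires_from)

That is: anything above roughly 40 bytes/s keeps the connection up,
and drags the Primary down to that speed.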

On the Primary, if it is still responsive, try to watch the "ns:"
counter in /proc/drbd. If it keeps increasing, this is what is happening.
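
If you want something to watch that for you, a quick sketch (the exact
/proc/drbd layout differs between versions, so treat the parsing as an
assumption; device minor 0 is just an example):

# sample the "ns:" (network send) counter of one device twice and
# report whether it is still creeping upwards.
import re, time

def read_ns(minor=0):
    for line in open("/proc/drbd"):
        m = re.match(r"\s*%d:.*\bns:(\d+)" % minor, line)
        if m:
            return int(m.group(1))
    return None

before = read_ns(0)
time.sleep(30)
after = read_ns(0)

if before is None or after is None:
    print("could not find ns: for device 0")
elif after > before:
    print("ns: %d -> %d, still increasing: slow Secondary, not a dead one" % (before, after))
else:
    print("ns: %d, not moving" % after)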

If you have some "creative" idea how to cope with this, tell us.

In case the Primary is completely unresponsive, try again with 0.6.12.
But if your bdflush or kupdated or similar are currently "stuck" sending
(very slowly) DRBD blocks over to the peer, you need to be very patient...

Philipp: This is why I always wanted to do the sending within a
dedicated thread ... But never mind:

In the same situation, with DRBD 0.7 and Linux 2.6, the system will
just spawn another "bdflush-equivalent" thread, and at least the rest
of the system remains responsive.
But, you know, ... still some bugs hiding in 0.7-pre ...

Though 0.7 CVS seems to be in pretty good shape by now :)

	Lars Ellenberg


