[DRBD-user] Secondary SCSI Errors causing Primary Unresponsiveness

Mon Sep 20 16:58:52 CEST 2004

/ 2004-09-20 10:13:51 -0400
\ Tony Willoughby:
> > for an HA system you also need monitoring.  you monitor the box,
> > you see it has problems, you take it down (out of the cluster at least).
> > and if you had configured it for panic on lower level io error, it
> > should have taken down itself...
> 
> Here is my configuration, is "do-panic" what you are referring to?  I
> have that enabled.

yes. then obviously the IO error was not bad enough to be passed back up
to DRBD; it probably led to some retry cycle within the scsi code...
or some such.

> > since 0.6.10 or .12, we have the ko-count.
> > yes we have it in 0.7, too.
> 
> Excellent.  I will dig into ko-count.  Thanks for the tip.
> 
> > 
> > what it means is: if we cannot get any data transferred to our peer,
> > but it still answers "drbd ping packets", we normally would retry
> > (actually, tcp will do the retry for us), or continue to wait for ACK
> > packets. but we start the ko-count countdown. once this counter hits
> > zero, we consider the peer dead even though it is still partially
> > responsive, and we do not try to connect there again until explicitly
> > told to do so.
> 
> Any tips on how to tune the ko-count?  

10 will do, probably. if you have not seen any "ko-count = 4294967295"
messages yet, you can just set it to 1.  (it defaults to zero, and when
the situation is first encountered, the counter is decremented and wraps,
thus effectively disabling the feature, but still screaming about it to
syslog)
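
where the 4294967295 comes from can be sketched with a bit of shell
arithmetic. this assumes the counter is an unsigned 32-bit integer (which
the number in the log message suggests) and a shell with 64-bit
arithmetic, such as bash; the variable names are only for illustration.

```shell
# assumption: ko-count is kept in an unsigned 32-bit counter
KO_COUNT=0                                   # the default
# decrementing 0 and keeping only the low 32 bits gives the wrapped value
WRAPPED=$(( (KO_COUNT - 1) & 4294967295 ))
echo "ko-count = $WRAPPED"
```

a counter that large never reaches zero in practice, which is why the
default behaves as "feature disabled".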

see, the ping-timeout defaults to 6 seconds (it is 60 in the conf file,
in units of a tenth of a second). ko-count * ping-timeout is the estimate
for recognizing this situation. but, as I said, if even _one_ block is
transferred during that time, the connection is kept, and the primary
slows down to the speed of the secondary...
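
as a worked example of that estimate, a minimal sketch using the
ko-count of 10 suggested above and the default timeout of 6 seconds
(the variable names are made up for illustration, they are not drbd
settings):

```shell
KO_COUNT=10     # value suggested above
TIMEOUT=60      # conf file value for the default, equals 6 seconds
# estimated time without any completed transfer before the peer is dropped
DETECT_SECS=$(( KO_COUNT * TIMEOUT / 10 ))
echo "peer considered dead after roughly ${DETECT_SECS}s without progress"
```

with ko-count 1 the same calculation gives about 6 seconds.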

> Any tips on how to simulate a failing disk in the lab?

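  LODEV=/dev/sdd
  # size in 512-byte sectors, which is what device-mapper tables use
  # (fdisk -s reports 1K blocks and would map only half the device)
  LOSIZE=$(blockdev --getsz $LODEV)
  echo "0 $LOSIZE linear $LODEV 0" | dmsetup create r0
  run drbd on top of that (/dev/mapper/r0).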

 later, do
  dmsetup suspend r0
  echo "0 $LOSIZE error" | dmsetup reload r0
  dmsetup resume r0

there you go, next request will fail.

or have a look into the
testing/CTH subdirectory of a recent drbd tgz.
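
to switch the dmsetup test device above back and forth between "working"
and "failing", the two table lines can be generated by small helpers. this
is only a sketch: dm_linear_table and dm_error_table are hypothetical
names, not part of dmsetup or drbd, and they only print text, so they are
harmless to run without root.

```shell
# hypothetical helpers: print a one-line device-mapper table.
# the size argument is in 512-byte sectors.
dm_linear_table() { echo "0 $1 linear $2 0"; }   # pass-through to a real device
dm_error_table()  { echo "0 $1 error"; }         # every request fails

# as root, between "dmsetup suspend r0" and "dmsetup resume r0":
#   dm_error_table  "$LOSIZE"          | dmsetup reload r0   # start failing
#   dm_linear_table "$LOSIZE" "$LODEV" | dmsetup reload r0   # heal again

dm_linear_table 2048 /dev/sdd
dm_error_table 2048
```

reloading the linear table again lets you test how the cluster behaves
when the "broken" disk comes back.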

> > however, if your secondary just becomes very slooow and does not fail
> > completely, this mechanism will not work and indeed slow down the
> > primary, too. sorry about that.
> > btw, linbit takes sponsors and support contracts.
> > if you don't think you need our support,
> > think of it as you supporting us instead!
> 
> We have!  :^)

great.

> My company had a service contract with Linbit for several years.

I knew you have been a loyal drbd user for years... I did not check your
support status, though.
hopefully others will follow you (again), because, since I do my job too
well on this list for free, we lost several contracts: "why should we
pay for support, when we get it for free on drbd-user"...
some people still don't quite get it :(

does anyone have a suggestion for how we can improve on
being supported for drbd?

> Thanks for your input Lars.
> 
> > 
> > and yes, 0.7 improves here too, because it has the concept of
> > "NegAck"s and "detaching" the lower level device on io errors,
> > continuing in "diskless" mode. which makes it possible for your
> > monitoring system to do a _graceful_ failover once it recognizes that
> > the primary went into diskless state because of underlying io errors.
> > 
> > we are better than you think...
> > but we have to improve our documentation obviously.

	Lars Ellenberg

-- 
please use the "List-Reply" function of your email client.