[DRBD-user] How can web100 block drbd?

Lars Ellenberg Lars.Ellenberg at linbit.com
Tue Apr 4 16:13:51 CEST 2006



/ 2006-04-03 02:28:48 -0500
\ Maurice Volaski:
> Web100 is a patch to the Linux kernel (see http://www.web100.org) that
> enables one to get statistics on all TCP traffic. There was a bug in the
> previously posted version, 2.5.8, but the author seems to have fixed it.
> 
> So as far as I can tell, all network traffic works except for drbd!
> 
> The computer running drbd (0.7.17 under kernel 2.6.15.6) has web100 and
> displays this message when it should be receiving data from its peer
> (the acting primary):
> 
> <3>[14938.140724] drbd1: [drbd1_receiver/8822] sock_sendmsg time expired, ko = 4294967295

this message is triggered when
 we have data blocks pending
 but made no progress in sending them for longer than the configured
 "drbd-ping" interval, which defaults to six seconds, iirc.
 the "knock out counter" (ko-count) is then decremented with each
 drbd-ping packet sent. if a drbd-ping packet is not answered in time,
 we declare "network failure" and try to reconnect. if the ping packet
 is answered, we continue to ping until the ko-count reaches zero, at
 which point we declare the peer's io-subsystem broken and go StandAlone.
 if we make any progress (just one data block is enough), we reset
 the ko-count.
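
the logic above can be sketched roughly as follows. this is only a
plain Python simulation of the description, not the actual drbd code:
the function name simulate, the Outcome enum and the event strings
("progress", "ping_ok", "ping_lost") are all made up for illustration,
and each event stands for one expired ping interval with data pending.

```python
from enum import Enum


class Outcome(Enum):
    OK = "still connected"
    NETWORK_FAILURE = "network failure, try to reconnect"
    STANDALONE = "peer io-subsystem declared broken, go StandAlone"


def simulate(events, ko_count):
    """Replay a sequence of hypothetical events against a ko-count.

    Returns (outcome, remaining_ko_count).
    """
    if ko_count <= 0:
        raise ValueError("this sketch assumes a positive ko_count")

    remaining = ko_count
    for event in events:
        if event == "progress":
            # at least one data block got through: reset the counter
            remaining = ko_count
            continue
        if event not in ("ping_ok", "ping_lost"):
            raise ValueError(f"unknown event: {event!r}")

        # no progress within the interval: a ping is sent,
        # and the counter is decremented with it
        remaining -= 1
        if event == "ping_lost":
            # ping not answered in time
            return Outcome.NETWORK_FAILURE, remaining
        if remaining == 0:
            # peer answers pings but never takes our data
            return Outcome.STANDALONE, remaining

    return Outcome.OK, remaining


if __name__ == "__main__":
    print(simulate(["ping_ok", "ping_ok", "progress", "ping_ok"], ko_count=3))
    print(simulate(["ping_ok", "ping_ok", "ping_ok"], ko_count=3))
    print(simulate(["ping_ok", "ping_lost"], ko_count=3))
```

note how a single "progress" event is enough to put the counter back
to its configured value, so a link that is merely very slow keeps
logging the message without ever being knocked out.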

> (Interestingly, the peer won't let me access the drbd disks; they appear hung)
> 
> The author is not convinced web100 is responsible. He believes it is
> possible that web100 alters timing that somehow could be triggering a
> bug elsewhere in the kernel, including in drbd.

so this is not a bug in drbd; rather, web100 "somehow" alters the timing
such that it takes several seconds to get a single data block through.

this is also what makes your disks appear hung: they are as slow as the
network connection now, which seems to be throttled by web100 (or your
configuration of web100) down to a couple of bytes every few seconds...

-- 
: Lars Ellenberg                                  Tel +43-1-8178292-0  :
: LINBIT Information Technologies GmbH            Fax +43-1-8178292-82 :
: Schoenbrunner Str. 244, A-1120 Vienna/Europe   http://www.linbit.com :
__
please use the "List-Reply" function of your email client.


