[DRBD-user] Bad network connection causing DRBD to freeze

Lars Ellenberg lars.ellenberg at linbit.com
Mon Feb 2 13:51:49 CET 2009


On Mon, Feb 02, 2009 at 01:23:51PM +0100, Rainer Sabelka wrote:
> Hi,
> 
> I'm using DRBD (0.8.12) on a pair of servers in separate locations connected 
> by an (almost) dedicated 1GBit ethernet link.
> This connection has become unreliable in a way that from time to time we see 
> packet loss of up to 30 percent.
> During these phases of high packet loss, access to the DRBD device blocks 
> for several minutes and the applications accessing the disk become completely 
> unresponsive.
> 
> While we are trying to fix the network connection in the first place, I wonder 
> if I can do something with DRBD to work around this problem.
> 
> From what I see in the logfiles, it seems that DRBD detects the network 
> failure, disconnects, and immediately tries to reconnect. Then it stays for 
> several minutes in the WFBitMapS state.
> It seems that any access to the DRBD device during this time blocks until the 
> state SyncSource is reached.
> If the packet loss on the network continues for a longer period, this 
> disconnect-reconnect cycle repeats several times. 
> The result is that a disturbance in the network connection between the servers 
> basically suspends all running services which depend on DRBD.
> 
> To work around the problem I've now put DRBD into stand alone mode.
> Is there anything else I can do about this?

fix the network link?

well, seriously: what would you like DRBD to do?

we are always open for suggestions
 * for scenarios we did not yet think about,
 * for ideas how to "best" handle those scenarios
just mail us a wishlist.

> -Rainer
> 
> ---
> 
> PS: syslog output and drbd.conf:
> 
> on server2 (primary):
> 
> Jan 30 11:21:33 server2 kernel: drbd0: [drbd0_worker/13962] sock_sendmsg time 
> expired, ko = 19

if you configure your ko count smaller,
then if it actually reaches zero,
drbd stays disconnected (StandAlone),
until you tell it to reconnect explicitly.
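
as a rough sketch of what that could look like in drbd.conf
(the resource name "r0" and both values are made up for illustration,
not taken from the posted config; timeout is given in tenths of a second,
and the peer is dropped once a write has been outstanding for roughly
ko-count times timeout):

```
resource r0 {                # hypothetical resource name
  net {
    timeout   60;            # unit is 0.1 s, so 6 seconds (illustrative value)
    ko-count  4;             # give up after about 4 x 6 s = 24 s (illustrative value)
  }
}
```

with settings like these, the primary should stop waiting for the peer
after about 24 seconds instead of several minutes. once the link is
healthy again, "drbdadm connect r0" tells it to reconnect. check the
drbd.conf man page of your version for the exact ranges and defaults.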

-- 
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
__
please don't Cc me, but send to list   --   I'm subscribed