[DRBD-user] disconnecting hangs after ko-count failure

Walter Haidinger walter.haidinger at gmx.at
Tue Jan 22 18:35:27 CET 2008


> and you are sure that nothing else but the drbd version changed?
> same kernel, same wlan drivers, those metal-wire-fruit baskets
> in the middle of the room did not start to dance, and your neighbor
> still has the same old microwave oven?

Of course there were some other changes, but the only noticeable one with regard to drbd is the major version change. The wireless link is run
by two Linksys WRT54GL routers with external antennas, running a custom version of OpenWRT. I did not change anything worth mentioning in the setup there.

> well.
> what about a flood ping with big packets?
> # ping -w 20 -f -s 4100 peer-node
> or saturating your link using dd and netcat...

drbd v7 saturated the 11 Mbit/s link while syncing about 400 GB, which obviously took several days. That caused no problems back then.
Traffic shaping using HTB keeps the link usable even if drbd uses all of the remaining bandwidth.
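
To illustrate the kind of HTB shaping meant here, a rough sketch follows. It is not the configuration actually running on this link: the function name htb_commands, the interface eth0, the rates, and port 7788 are all assumptions chosen for the example. The function only prints tc commands, so they can be reviewed before anything is applied.

```shell
#!/bin/sh
# Print (do not run) tc commands for a two-class HTB setup.
# Every value passed in is an illustrative assumption, not the real setup.
htb_commands() {
    dev=$1      # outgoing interface (assumed, e.g. eth0)
    link=$2     # usable link rate (assumed, e.g. 5mbit)
    inter=$3    # guaranteed rate for interactive traffic (assumed)
    bulk=$4     # guaranteed rate for drbd traffic (assumed)
    port=$5     # TCP port of the drbd resource (assumed, e.g. 7788)

    # Root qdisc; unclassified traffic falls into class 1:10.
    echo "tc qdisc add dev $dev root handle 1: htb default 10"
    # Parent class capped at the usable link rate.
    echo "tc class add dev $dev parent 1: classid 1:1 htb rate $link"
    # Interactive class: higher priority, may borrow up to the full link.
    echo "tc class add dev $dev parent 1:1 classid 1:10 htb rate $inter ceil $link prio 0"
    # Bulk class for drbd: lower priority, takes whatever bandwidth remains.
    echo "tc class add dev $dev parent 1:1 classid 1:20 htb rate $bulk ceil $link prio 1"
    # Steer drbd traffic into the bulk class, matching the port on either side.
    echo "tc filter add dev $dev parent 1: protocol ip prio 1 u32 match ip dport $port 0xffff flowid 1:20"
    echo "tc filter add dev $dev parent 1: protocol ip prio 1 u32 match ip sport $port 0xffff flowid 1:20"
}
```

For example, `htb_commands eth0 5mbit 1mbit 4mbit 7788` prints six commands; piping that output to `sh` as root would apply them. Shaping only affects outgoing traffic on the given interface, so each end of the link needs its own rules.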

The reason for upgrading to v8 in the first place was just the fact that
openSUSE 10.3 comes with drbd v8 and I tried to use the provided kernel and drbd module on east.

> please do
> # ps -eo pid,state,wchan:30,cmd | grep -e D -e drbd

  171 S drbd_nl_disconnect             [cqueue/1]
 7735 S -                              [drbd0_worker]
13018 D drbd_disconnect                [drbd0_receiver]
21135 S pipe_wait                      grep drbd

The kernel threads hang and therefore the drbd module can't be unloaded.
Any way to manually kill them?
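
The grep in the quoted command matches any line containing a "D", so a stricter filter on the state column may be handy. A minimal sketch, where list_d_state is a hypothetical helper name and the input is assumed to be in the `ps -eo pid,state,wchan:30,cmd` format used above:

```shell
#!/bin/sh
# Read `ps -eo pid,state,wchan:30,cmd` style output on stdin and print
# pid, wait channel and the first word of the command for every task
# whose state column is exactly "D" (uninterruptible sleep).
list_d_state() {
    awk '$2 == "D" { print $1, $3, $4 }'
}
```

On a live system this would be used as `ps -eo pid,state,wchan:30,cmd | list_d_state`. As far as I know, a task in state D does not act on signals until the call it is blocked in returns, so `kill -9` would probably have no effect on pid 13018 while it sits in drbd_disconnect.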

>  * saturate your network using other means,
>    and see if it does similar things.

I'm currently using rsync as a (crude ;-) drbd replacement, which works without any obvious network problems.

>  * use tcpdump/wireshark, once you see the first "ko-count" message,
>    have a look and have a guess.

Yes, maybe this would identify the source of the ko-counts.

However, I currently do not mind drbd disconnecting because of network problems. Perhaps (probably?) some of them affect only drbd and not normal ssh or copy operations.

In any case, I'm more concerned about drbd hanging in the disconnecting state, so that I have to reboot(!) to resolve it and reconnect. This is the real issue; frequent network disconnects are _not_ my problem (hopefully ;-).

Thanks a lot for your reply.

Regards, Walter