[DRBD-user] drbd 8.2.1 partially hanging when writing lots of data

Lars Ellenberg lars.ellenberg at linbit.com
Thu Dec 27 20:21:03 CET 2007


On Tue, Dec 25, 2007 at 09:05:20PM -0800, Michael Nelson wrote:
> I have just set up DRBD 8.2.1 and am trying it out on Linux 2.6.18 (Xen
> 3.1.0 Dom0) over a gigabit ethernet with ext3. The source machine is a
> Pentium 4 w/ 512MB of RAM, the target is an AMD Athlon XP 3200+ w/ 1GB of
               ^^^
> RAM. Both machines use Intel PRO/1000 MT NICs.
> I am having problems when writing large amounts of data to the drbd device.
> If I write, say, 250MB of data (using dd or perl) in one shot, there are no
                   ^^^
> hangs and I get pretty reasonable performance (~40MB/sec). If I do that
> multiple times within 5-6 seconds of each other, or I write a lot of data
> (1GB) in one shot, the writes take 2x-10x longer, with intermittent
> disk activity on the target (it's not sitting there waiting for the disk).

the first dd only goes to the page cache,
which gets flushed to disk only some time later.
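
to see the difference, one can time the same write with and without a
final fsync. this is only a sketch: the function names are made up here,
and conv=fsync is assumed to be available (it is a GNU dd option).

```shell
#!/bin/sh
# CAUTION: pointing these at a block device overwrites its contents.
# use a scratch file on the mounted drbd device instead.

# write $2 MiB of zeros to $1; dd returns as soon as the data sits in
# the page cache, the kernel writes it out later.
write_cached() {
    dd if=/dev/zero of="$1" bs=1M count="$2" 2>&1
}

# same write, but dd calls fsync() on the output before it exits, so
# the time it reports includes getting the data out of the page cache.
write_synced() {
    dd if=/dev/zero of="$1" bs=1M count="$2" conv=fsync 2>&1
}
```

for example, `write_synced /mnt/drbd/testfile 250` (the path is
hypothetical) should give a throughput figure closer to what disk and
replication link can actually sustain.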

> Once or twice it has hung so bad, that I've had to reboot both boxes.
> 
> I've looked at netstat -s, and there don't appear to be issues with TCP
> retransmissions. When in this state, I have tried to force resync the target
> (forced overwrite), but /proc/drbd eventually showed both systems stalled
> for good.

do NOT "force sync" if you "get stuck" during normal operation.
that is just nonsense.

my guess is that tcp and page cache (and xen and whatnot) are competing
for memory, and you hit some out-of-memory deadlock/livelock.
what is your device size?

it would be interesting to find out where exactly it hangs.
if the boxes still respond on the console,
try to get a process list while it "hangs":
 ps -eo pid,state,wchan:40,comm | grep -e "[d]rbd" -e D
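
since the hang is intermittent, it may help to take that snapshot
repeatedly together with a few memory counters. a minimal sketch; the
function name is hypothetical, and which /proc/meminfo fields are worth
watching is an assumption based on the memory-pressure guess above.

```shell
#!/bin/sh
# take one timestamped snapshot of the drbd threads and of every line
# containing "D" (processes in uninterruptible sleep show state D).
snapshot_blocked() {
    date '+%Y-%m-%d %H:%M:%S'
    ps -eo pid,state,wchan:40,comm | grep -e "[d]rbd" -e D
    # very low MemFree during a hang would fit the out-of-memory guess
    grep -e MemFree -e Dirty -e Writeback /proc/meminfo
}
```

run it as `while sleep 5; do snapshot_blocked; done` on a console or an
ssh session that is logged on another machine, so that collecting the
output does not depend on the disk that may be stuck.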

> I have tried both protocol B and protocol C and end up with the same basic
> problem.
> 
> I have modified the various performance knobs in drbd.conf as follows:
>    sndbuf-size (default)
>    max-buffers 40000

this is likely too high if you have only 512 MB RAM.
try with max-buffers 4000, max-epoch-size 4000 or less.
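
rough arithmetic behind that, assuming one buffer is one 4096-byte page
(check the page size on your platform; the function name is made up
for this sketch):

```shell
#!/bin/sh
# upper bound, in MiB, on the memory that max-buffers can tie up.
# assumption: each buffer is a single 4096-byte page.
buffers_mib() {
    echo $(( $1 * 4096 / 1024 / 1024 ))
}

buffers_mib 40000   # prints 156
buffers_mib 4000    # prints 15
```

156 MiB is close to a third of a 512 MB box, before page cache, tcp
buffers and xen take their share.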

>    unplug-watermark 128
>    rate 100M
>    al-extents 3833
> 
> I have just one resource setup thus far.
> 
> Any ideas? If you need any more information, I will be happy to send it.
> 
> Thanks,
> -mike


-- 
: Lars Ellenberg                            Tel +43-1-8178292-55 :
: LINBIT Information Technologies GmbH      Fax +43-1-8178292-82 :
: Vivenotgasse 48, A-1120 Vienna/Europe    http://www.linbit.com :


