On 09/23/2010 02:52 AM, Alex Adriaanse wrote:
> These hangs seemed to coincide with times when there was a large spike
> in memory consumption and much of the server's physical memory was used
> up, resulting in a significant increase in swapping.  The server has 4GB
> of RAM and 3GB of swap space.  The most swap space I've seen in use was
> 0.9GB (during the periods of heavy memory consumption).  However, I
> wasn't able to measure actual swap usage during these freezes, so I
> can't confirm this correlation or the exact swap usage during the
> freezes.
I have seen this problem as well. As a workaround, you could try adding
"data-integrity-alg crc32c;" to the net section. With this option, the
connection is reset when an error is detected and is then established
again. Since I did that, the timeouts are gone, but several times a day
I see "Digest integrity check FAILED" messages. This seems to happen
under high IO and CPU load. I posted a message to this list earlier
describing how I can reproduce it: just run "pi 22" with a /tmp
partition that is backed by DRBD (which, as I read it, is not your
setup).
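
For reference, the workaround would look roughly like the fragment
below. This is only a sketch: "r0" is a placeholder resource name, not
taken from your configuration, and the rest of the resource definition
is left out. I am assuming the same line goes into the config on both
nodes.

```
# Sketch only: "r0" is a placeholder resource name.
# Keep your existing settings and add the option to the net section.
resource r0 {
  net {
    # Checksum the replicated data on the wire with crc32c.
    data-integrity-alg crc32c;
  }
}
```

After changing the config, "drbdadm adjust r0" (again with your own
resource name) should apply it.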

> Some background information: the /usr, /var, /var/log, /home, and /srv
> filesystems run off various DRBD devices, which use LVM logical volumes
> as the underlying storage, which in turn uses two hard drives mirrored
> using MD RAID1 as its physical volumes.  The DRBD devices are configured
> as Primary role, with the Secondary server being connected over a long
> distance link.  The root filesystem, /tmp, and swap bypass DRBD and use
> LVM logical volume directly.  These logical volumes reside on the same
> physical volume as the logical volumes that back the DRBD devices
> mentioned above.
This looks like a complicated setup. Is this because you cannot run the
entire machine from a DRBD device? Is this intended for failover?


