[DRBD-user] drbd sync killing machine with OOM

Dmitry S. Makovey dmitry at athabascau.ca
Fri May 4 17:34:50 CEST 2007

On Thursday 03 May 2007 13:11, Rene Mayrhofer wrote:
> On Thursday, 3 May 2007 19:52:00, Rene Mayrhofer wrote:
> > The critical problem that's caused our servers to go offline for a few
> > times in the past 2 days is that when too many of the drbd volumes are
> > syncing at the same time, both nodes fall into OOM hell and reboot after
> > some time, repeating the cycle all over again.
> Update: it just happened again, but this time without any sync being
> performed at all (the sync network between the nodes is physically down at
> the moment to prevent the issue...). The current master node just went down
> with OOM errors, but it had 95% of its 3GB swap partition free at any time.
> Therefore I have to assume that non-swappable memory was consumed until
> none was left. The XEN dom0 instance where drbd is configured currently has
> 384MB RAM assigned, with basically no additional services running (sshd and
> a slave slapd).
> How much is drbd supposed to consume per GB and per volume?

Without pretending to be an expert on the subject, I'll offer some things I'd 
consider. Feel free to take the advice or not, or correct me if I'm wrong.

1. Your setup seems "strange" in that you appear to be doing 

> /dev/mdX --> LVM2 volumes --> drbd volume for each LV --> XEN domU

whereas what I'd do is /dev/mdX -> drbd -> Xen domU + LVM2. This way you end 
up with only a few drbdX devices (or maybe just one).

2. You're assuming that Xen migration is going to work with DRBD, but as I 
understand it Xen needs both systems to have the storage attached at the same 
time for some period (the duration of the migration). With DRBD one system 
should stay offline, unless you're ready to let the two systems run out of 
sync.
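
For point 1, a rough sketch of what a single DRBD resource sitting directly on 
the md device could look like in drbd.conf. The resource name, host names, 
addresses, port and /dev/md0 are made up for illustration, and I'm assuming 
LVM2 is layered on top of /dev/drbd0 afterwards. Check the syntax against your 
DRBD version before using any of it.

```
resource r0 {
  protocol C;

  # hypothetical host name; it has to match the output of `uname -n`
  on node-a {
    device    /dev/drbd0;
    # the md array itself, not an LV carved out of it
    disk      /dev/md0;
    # hypothetical address on the dedicated sync link
    address   192.168.10.1:7788;
    meta-disk internal;
  }

  # hypothetical peer, same layout on the other node
  on node-b {
    device    /dev/drbd0;
    disk      /dev/md0;
    address   192.168.10.2:7788;
    meta-disk internal;
  }
}
```

With a layout like this there is only one device that can ever resync, 
instead of one per LV, and the volumes for the domUs would be created on top 
of /dev/drbd0.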

Those are probably not the root cause of your trouble, but simplifying the 
setup could help to get around the problem ;)

Somebody more knowledgeable on the subject, please correct me if I'm wrong.

Dmitry Makovey
Web Systems Administrator
Athabasca University
(780) 675-6245