> > I have about 40 drbd devices per node (primary and secondaries). Our provider
> > has a lot of network issues, which sometimes cause drbd to disconnect/reconnect
> > very often: about 500 NetworkFailure in 1 hour before the last crash:
> > # grep "Connected -> NetworkFailure" /var/log/messages|grep -c "Mar 30 00"
> > 483
>
> So you are using DRBD with ganeti in a cloud?
> Which cloud?

What do you mean by "which cloud"?

> The most interesting line is before that.
>
> > Mar 30 00:52:48 z2-6 kernel: [1685605.588315] CPU 2
> > Mar 30 00:52:48 z2-6 kernel: [1685605.589086] Pid: 21781, comm: drbd0_worker Tainted: G W 2.6.30-2-amd64 #1 X8STi
> > Mar 30 00:52:48 z2-6 kernel: [1685605.594280] RIP: 0010:[<ffffffff802bbc80>] [<ffffffff802bbc80>] cache_alloc_refill+0xf6/0x1f9
>
> Hard out of memory?
> Did you google for "2.6.30 cache_alloc_refill",
> and check that you are not affected by any of those?

Yep, but there is not much to find. Could it be that, because of the large
number of NetworkFailure/reconnection events, the system does not free memory
fast enough, so that when the network/drbd driver asks for memory the
allocation fails and the driver deactivates itself (especially if we are in
some special context, like IRQ)?

Maxence

-- 
Maxence DUNNEWIND
Contact : maxence at dunnewind.net
Site : http://www.dunnewind.net
GPG : 18AE 61E4 D0B0 1C7C AAC9 E40D 4D39 68DB 0D2E B533
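
The grep count quoted in this message covers one hand-picked hour ("Mar 30 00"). As a rough illustration only, the same count could be broken down for every hour in the log, which might help show whether the disconnect rate was climbing before the crash. This is a minimal sketch, not something that was run on the node: the function name `failures_per_hour` and the sample lines are hypothetical, and it assumes the classic syslog timestamp prefix seen in the kernel excerpt.

```python
import re
from collections import Counter

# Classic syslog prefix, e.g. "Mar 30 00:52:48 z2-6 kernel: ..."
STAMP = re.compile(r"^(\w{3})\s+(\d{1,2})\s+(\d{2}):\d{2}:\d{2}\s")

# Same substring the grep in the message looks for
TRANSITION = "Connected -> NetworkFailure"


def failures_per_hour(lines):
    """Count lines containing TRANSITION, keyed by 'Mon DD HH'."""
    counts = Counter()
    for line in lines:
        if TRANSITION not in line:
            continue
        match = STAMP.match(line)
        if match:
            month, day, hour = match.groups()
            counts[f"{month} {day} {hour}"] += 1
    return counts


# Hypothetical sample lines, made up for illustration (not real DRBD output)
SAMPLE = [
    "Mar 30 00:10:01 nodeA kernel: drbd0: Connected -> NetworkFailure",
    "Mar 30 00:10:05 nodeA kernel: drbd0: NetworkFailure -> Unconnected",
    "Mar 30 00:42:17 nodeA kernel: drbd3: Connected -> NetworkFailure",
    "Mar 30 01:03:30 nodeA kernel: drbd0: Connected -> NetworkFailure",
]

if __name__ == "__main__":
    for hour, count in sorted(failures_per_hour(SAMPLE).items()):
        print(hour, count)
```

To use it on a real log, one would pass an open file instead of `SAMPLE`, for example `failures_per_hour(open("/var/log/messages"))`. Sorting the keys as strings is only meaningful within a single month.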