Note: "permalinks" may not be as permanent as we would like;
direct links to old sources may well be a few messages off.
On Mon, Mar 19, 2012 at 05:04:44PM +0900, Christian Balzer wrote:
>
> Hi Florian,
>
> On Fri, 16 Mar 2012 13:55:17 +0100 Florian Haas wrote:
>
> > On Wed, Mar 14, 2012 at 7:48 AM, Christian Balzer <chibi at gol.com> wrote:
> > > Hello,
> > >
> > > This is basically a repeat of:
> > > http://lists.linbit.com/pipermail/drbd-user/2011-August/016758.html
> > >
> > > 32GB RAM, Debian Squeeze, 3.2 (debian backport) kernel, 8.3.12 DRBD,
> > > IPOIB in connected mode with a 64k MTU. Just 2 DRBD resources.
> > >
> > > After encountering this for the first time (never showed up in two
> > > weeks of stress testing, which only goes to prove that real life just
> > > can't be simulated) I found the above article and changed the
> > > following sysctls:
> > >
> > > vm/min_free_kbytes = 262144
> [snip]
> > >
> > > Lars hinted at "atomic reserves" in his reply, which particular
> > > parameters are we talking about here?
> >
> > I had hoped for Lars to pitch in here, but I guess I'll give it a go
> > instead. Note I'm certainly no kernel memory management expert, but
> > I'm not aware of anything that would fit that description other than
> > the vm.min_free_kbytes sysctl you've already mentioned.
> >
> Yeah, that was my assumption, too.

Well, no. Or rather, "it depends".

The trace you posted contains tcp_sendmsg, so from the send path.

In the *receive* path, the min_free_kbytes actually make a difference.
In the *send* path, typically it does not, because we are not in
"atomic" context, but may block/sleep, and thus this reserve should
normally not be touched.

Also, the problem is not insufficient free memory,
but insufficient free memory of the desired "order".
Put it differently: problem is memory fragmentation.

So you need to look into memory "defragmentation",
which is better known as "memory compaction" in the linux kernel.

Relevant sysctls:
compact_memory (trigger to do an ad-hoc compaction run)
extfrag_threshold,
probably a few more.

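
One way to check whether higher-order blocks are actually scarce is to
read /proc/buddyinfo, which lists the number of free blocks per order
for each memory zone. The following is only a minimal Python sketch,
assuming the usual "Node N, zone NAME count0 count1 ..." line layout.
The function names (parse_buddyinfo, free_blocks_at_or_above, report,
trigger_compaction) are made up for illustration and are not from this
thread. Writing to compact_memory needs root and a kernel built with
compaction support.

```python
from pathlib import Path


def parse_buddyinfo(text):
    """Map (node, zone) -> list of free block counts; list index = order."""
    zones = {}
    for line in text.splitlines():
        parts = line.split()
        # Assumed layout: Node 0, zone Normal 216 55 189 ...
        if len(parts) < 5 or parts[0] != "Node" or parts[2] != "zone":
            continue
        node = int(parts[1].rstrip(","))
        zones[(node, parts[3])] = [int(n) for n in parts[4:]]
    return zones


def free_blocks_at_or_above(counts, order):
    """Number of free blocks that can satisfy an allocation of `order`."""
    return sum(counts[order:])


def report(min_order=3, path="/proc/buddyinfo"):
    """Print, per zone, how many free blocks of at least min_order exist."""
    zones = parse_buddyinfo(Path(path).read_text())
    for (node, zone), counts in zones.items():
        usable = free_blocks_at_or_above(counts, min_order)
        print(f"node {node} zone {zone}: {usable} free blocks of order >= {min_order}")


def trigger_compaction(path="/proc/sys/vm/compact_memory"):
    """Ask the kernel for an ad-hoc compaction run (root only)."""
    Path(path).write_text("1\n")
```

Calling report() before and after trigger_compaction() on an affected
box would show whether compaction frees up blocks of the order the
failing allocation asked for. A zone with plenty of order-0 pages but
few higher-order blocks matches the fragmentation described above.
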
Or you need to fix the drivers to not require higher order page
allocation, but be ok with just some single pages scattered around.

> > SUSE's kernel documentation team, btw, lists these "page allocation
> > failure" warnings as no cause for concern as long as they happen
> > infrequently:
> >
> Once or twice per day would fit that bill, however they still make me
> wonder.
> I doubled the vm.min_free_kbytes again to 512MB and still got them at
> times with particular high activity. Not sure if upping to 1GB would
> actually make it go away, as reported free memory was several GB at least
> once when such a failure was logged.
>
> I guess I'll just keep an eye on it, these boxes are at about 30% of their
> expected load/capacity (I/O, not space) now...

--
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
__
please don't Cc me, but send to list -- I'm subscribed