On 08/05/2015 11:34 AM, Helmut Wollmersdorfer wrote:
> Hi,
>
> in the past (3 years ago AFAIR) I experienced oom-killer problems with too small configurations of memory in Xen Dom0. This can cause an unreachable node, if ssh is oom-killed.

These are multiple problems. One is running out of memory, which can happen for various reasons; another is that the system is misconfigured to allow random processes to be killed when it runs out of memory.

Linux's default configuration overcommits memory, which means that it fulfills application requests for memory even when all the memory is already reserved for other processes, based on the assumption that most processes will probably not use all the memory they could theoretically need (e.g., if every process got a copy of all its copy-on-write memory pages, etc.). Sometimes this assumption works out, sometimes it doesn't, and that's when the oom-killer starts to kill random processes.

Quite obviously, from an availability point of view, it would be better to reconfigure the Linux kernel so that it denies application requests for memory as soon as it can no longer guarantee that there is enough free memory even for the case that all processes actually use as much memory as they theoretically could. You can do that by setting

vm.overcommit_memory=2
vm.overcommit_ratio=n

...where n is some rather high percentage of random access memory that will be made available to applications; probably something in the range of 90 to 99, depending on the hard- and software configuration.

This WILL use more memory, but it will also improve the system's robustness regarding memory shortage situations. For this feature to be useful, swap space must be configured (so that Linux can still grant more than the RAM's size to applications, simply by reserving the required amount of swap space).

> Migrating drbd-devices and VMs to the new cluster it now seems to touch the limits as it begins to swap a little bit.
>
> I still need to migrate 3 VMs (6 drbd-devices) to the new cluster.
>
> Will it work without problems, without downtime for repairing the misconfiguration?

I guess you will have to restart the dom0 after assigning more memory to it. I am not sure whether you can add memory to it on-the-fly.

Regarding the memory management settings mentioned above, those can be changed on the fly, provided that enough memory is free at the time of the change, and provided the change is made in the right order.

> Re: [DRBD-user] How much memory does a drbd-device need?

Since the bitmap is always kept in memory while the resource is online, every resource requires somewhat more than the bitmap's size in memory. The bitmap's size is approximately 32 kilobytes of bitmap data per gigabyte of replicated storage (= 32 megabytes per terabyte).

That, and then whatever the buffer sizes for the resource are (as configured in its configuration file), plus some internal data structures (but that is a small factor compared to the others).

> TIA
>
> Helmut Wollmersdorfer
>

best regards,
Robert
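
The two rules of thumb in this message can be turned into a quick back-of-the-envelope calculation. The Python below is only a rough sketch: the function names and the example sizes are made up for illustration, the commit limit ignores reserved huge pages (which the kernel subtracts from RAM first), and the bitmap figure is the approximate 32 KiB per GiB quoted above, without buffers or internal data structures.

```python
KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB

# Approximate figure from the message: 32 KiB of bitmap per GiB of
# replicated storage (arithmetically, one bit per 4 KiB of storage).
BITMAP_BYTES_PER_GIB = 32 * KIB


def commit_limit_bytes(ram_bytes, swap_bytes, overcommit_ratio):
    """Rough commit limit with vm.overcommit_memory=2.

    Swap plus overcommit_ratio percent of RAM.
    Assumption: no huge pages are reserved.
    """
    return swap_bytes + ram_bytes * overcommit_ratio // 100


def drbd_bitmap_bytes(replicated_bytes):
    """Rough in-memory bitmap size for one resource of the given size."""
    return replicated_bytes * BITMAP_BYTES_PER_GIB // GIB


if __name__ == "__main__":
    # Hypothetical dom0: 2 GiB RAM, 1 GiB swap, ratio 95,
    # six replicated devices of 100 GiB each.
    limit = commit_limit_bytes(2 * GIB, 1 * GIB, 95)
    bitmaps = 6 * drbd_bitmap_bytes(100 * GIB)
    print("commit limit: %.1f MiB" % (limit / MIB))
    print("bitmaps for six devices: %.1f MiB" % (bitmaps / MIB))
```

With numbers like these the bitmaps themselves are small; the configured buffers and whatever else runs in the dom0 will usually matter more.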