Hi,

in the past (3 years ago AFAIR) I experienced oom-killer problems when the memory of a Xen Dom0 was configured too small. This can leave a node unreachable if ssh is oom-killed. Increasing the memory to 2 GB was enough for 12 guests with 2 virtual devices each, i.e. 24 drbd devices.

On a new cluster with 64 GB per node I naively configured the same 2 GB of memory for Dom0. While migrating drbd devices and VMs to the new cluster, Dom0 now seems to be reaching its limits, as it is beginning to swap a little.

I still need to migrate 3 VMs (6 drbd devices) to the new cluster. Will that work without problems, and without downtime for repairing the misconfiguration? And of course all VMs should be able to run on one node after a failover.

Summary:

root@xen12:~# cat /proc/drbd | grep version
version: 8.3.11 (api:88/proto:86-96)
srcversion: F937DCB2E5D83C6CCE4A6C9

root@xen12:~# cat /proc/drbd | grep Primary
 1: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
 2: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
 3: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
 4: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
15: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
16: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
17: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r-----
18: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r-----
19: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r-----
20: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r-----
23: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
24: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
25: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r-----
26: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r-----
27: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r-----
28: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r-----
29: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r-----
30: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r-----
31: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
32: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
33: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
34: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----

root@xen12:~# cat /proc/drbd | grep Primary | wc -l
22

root@xen12:~# lvdisplay -C --units g
  LV          VG   Attr     LSize   Pool Origin Data% Move Log Copy% Convert
  lv1         vg1  -wi-ao--  46.56g
  lv2         vg1  -wi-ao--   3.72g
  lv_drbd10_1 vg1  -wi-ao-- 100.00g
  lv_drbd10_2 vg1  -wi-ao--   2.00g
  lv_drbd12_1 vg1  -wi-ao-- 100.00g
  lv_drbd12_2 vg1  -wi-ao--   2.00g
  lv_drbd13_1 vg1  -wi-ao-- 100.00g
  lv_drbd13_2 vg1  -wi-ao--   2.00g
  lv_drbd14_1 vg1  -wi-ao-- 100.00g
  lv_drbd14_2 vg1  -wi-ao--   2.00g
  lv_drbd15_1 vg1  -wi-ao-- 100.00g
  lv_drbd15_2 vg1  -wi-ao--   2.00g
  lv_drbd16_1 vg1  -wi-ao-- 100.00g
  lv_drbd16_2 vg1  -wi-ao--   2.00g
  lv_drbd17_1 vg1  -wi-ao--  50.00g
  lv_drbd17_2 vg1  -wi-ao--   2.00g
  lv_drbd1_1  vg1  -wi-ao-- 150.00g
  lv_drbd1_2  vg1  -wi-ao--   2.00g
  lv_drbd2_1  vg1  -wi-ao-- 100.00g
  lv_drbd2_2  vg1  -wi-ao--   2.00g
  lv_drbd8_1  vg1  -wi-ao-- 100.00g
  lv_drbd8_2  vg1  -wi-ao--   2.00g
  lv_drbd9_1  vg1  -wi-ao-- 100.00g
  lv_drbd9_2  vg1  -wi-ao--   2.00g

root@xen12:~# free
             total       used       free     shared    buffers     cached
Mem:       1599544    1490852     108692          0     133396     421688
-/+ buffers/cache:     935768     663776
Swap:      3903484      11068    3892416

root@xen12:~# cat /proc/meminfo
MemTotal:        1599544 kB
MemFree:           90068 kB
Buffers:          133044 kB
Cached:           420476 kB
SwapCached:          336 kB
Active:           769764 kB
Inactive:         354888 kB
Active(anon):     477700 kB
Inactive(anon):   101496 kB
Active(file):     292064 kB
Inactive(file):   253392 kB
Unevictable:          72 kB
Mlocked:              72 kB
SwapTotal:       3903484 kB
SwapFree:        3892396 kB
Dirty:                36 kB
Writeback:             0 kB
AnonPages:        569652 kB
Mapped:            23380 kB
Shmem:              8064 kB
Slab:             208304 kB
SReclaimable:      57244 kB
SUnreclaim:       151060 kB
KernelStack:        4048 kB
PageTables:         7604 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:     4703256 kB
Committed_AS:     626304 kB
VmallocTotal:   34359738367 kB
VmallocUsed:      339164 kB
VmallocChunk:   34359395120 kB
HardwareCorrupted:     0 kB
AnonHugePages:         0 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
DirectMap4k:    66659472 kB
DirectMap2M:           0 kB

root@xen12:~# xm info
host                   : xen12
release                : 3.2.0-4-amd64
version                : #1 SMP Debian 3.2.60-1+deb7u3
machine                : x86_64
nr_cpus                : 24
nr_nodes               : 2
cores_per_socket       : 6
threads_per_core       : 2
cpu_mhz                : 2200
hw_caps                : bfebfbff:2c100800:00000000:00007f40:73bee3ff:00000000:00000001:00000281
virt_caps              : hvm hvm_directio
total_memory           : 65490
free_memory            : 46211
free_cpus              : 0
xen_major              : 4
xen_minor              : 1
xen_extra              : .4
xen_caps               : xen-3.0-x86_64 xen-3.0-x86_32p hvm-3.0-x86_32 hvm-3.0-x86_32p hvm-3.0-x86_64
xen_scheduler          : credit
xen_pagesize           : 4096
platform_params        : virt_start=0xffff800000000000
xen_changeset          : unavailable
xen_commandline        : placeholder dom0_mem=2048M,max:20148M
cc_compiler            : gcc version 4.7.2 (Debian 4.7.2-5)
cc_compile_by          : ultrotter
cc_compile_domain      : debian.org
cc_compile_date        : Sun Aug 17 10:54:25 EEST 2014
xend_config_format     : 4

root@xen12:~# xm list
[names removed]   ID   Mem VCPUs      State   Time(s)
                   0  2047    24     r-----  3570574.2
                  20  6144     2     -b----    25020.3
                  21  2048     2     -b----    12964.0
                   9  2048     2     -b----   650630.6
                  19  1024     1     -b----     2706.2
                  23  3072     2     -b----     8947.8
                  22  2048     1     -b----     4762.4

root@xen13:~# xm list
[names removed]   ID   Mem VCPUs      State   Time(s)
                   0  2047    24     r-----  2191892.7
                   5  3072     2     -b----    16419.9
                   4  5120     2     -b----    79276.0
                   6  2048     1     -b----     1826.5
                   3  2048     1     -b----   282876.7
                   2  4096     2     -b----   401536.1

Still to migrate from the old cluster:

root@xen10:~# cat /proc/drbd | grep Primary
 7: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r----
 8: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r----
 9: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r----
10: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r----
21: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r----
22: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r----

TIA
Helmut Wollmersdorfer
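
For reference, the failover requirement (all VMs running on one node) can be checked against the figures posted above with a little arithmetic. The following is only a rough sketch: the numbers are copied from the `xm list` and `xm info` output, the function name `failover_headroom` is made up for illustration, and the calculation ignores hypervisor overhead as well as the 3 VMs still to be migrated, whose memory sizes are not given in the post. It says nothing about whether 2 GB is enough inside Dom0 itself.

```python
# Guest memory in MiB, copied from the `xm list` output (Dom0 excluded).
guests_xen12 = [6144, 2048, 2048, 1024, 3072, 2048]
guests_xen13 = [3072, 5120, 2048, 2048, 4096]

total_memory = 65490  # MiB per node, from `xm info` on xen12
dom0_mem = 2047       # MiB, Dom0 row of `xm list`


def failover_headroom(guest_groups, total, dom0):
    """Return the MiB left on one node if it hosts Dom0 plus all listed guests.

    Hypothetical helper: ignores Xen's own overhead and any VMs not listed.
    """
    needed = dom0 + sum(sum(group) for group in guest_groups)
    return total - needed


headroom = failover_headroom([guests_xen12, guests_xen13], total_memory, dom0_mem)
print("guests on xen12:", sum(guests_xen12), "MiB")
print("guests on xen13:", sum(guests_xen13), "MiB")
print("headroom on a single node after failover:", headroom, "MiB")
```

With the posted values, both nodes currently carry the same amount of guest memory, and a single node hosting everything would still have memory unassigned at the hypervisor level.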