Hi,
In the past (3 years ago, AFAIR) I ran into oom-killer problems when Xen Dom0 was configured with too little memory. This can leave a node unreachable if the ssh daemon gets oom-killed.
Increasing the Dom0 memory to 2 GB was enough for 12 guests with 2 virtual devices each, i.e. 24 DRBD devices.
On a new cluster with 64 GB per node I naively configured the same 2 GB memory for Dom0.
While migrating DRBD devices and VMs to the new cluster, Dom0 now seems to be reaching its limits, as it has begun to swap a little.
I still need to migrate 3 VMs (6 drbd-devices) to the new cluster.
Will it work without problems, i.e. without needing downtime to repair the misconfiguration?
And of course all VMs must still be able to run on a single node after a failover.
Summary:
root@xen12:~# cat /proc/drbd | grep version
version: 8.3.11 (api:88/proto:86-96)
srcversion: F937DCB2E5D83C6CCE4A6C9
root@xen12:~# cat /proc/drbd | grep Primary
1: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
2: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
3: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
4: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
15: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
16: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
17: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r-----
18: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r-----
19: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r-----
20: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r-----
23: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
24: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
25: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r-----
26: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r-----
27: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r-----
28: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r-----
29: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r-----
30: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r-----
31: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
32: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
33: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
34: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
root@xen12:~# cat /proc/drbd | grep Primary | wc -l
22
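Note that `grep Primary` matches both `Primary/Secondary` and `Secondary/Primary`, so the 22 above counts every device on which either node is Primary. The following is only a minimal sketch, assuming the 8.3-style `/proc/drbd` line format shown above; the function name `count_local_roles` is hypothetical. It splits the count by local role (the part of `ro:` before the slash).

```python
import re
from collections import Counter

def count_local_roles(proc_drbd_text):
    """Count DRBD devices by local role (the part of ro: before the slash)."""
    roles = Counter()
    for line in proc_drbd_text.splitlines():
        match = re.search(r"\bro:(\w+)/(\w+)", line)
        if match:
            roles[match.group(1)] += 1
    return roles

# Two lines copied from the output above, used as a small sample.
sample = """\
 1: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
17: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r-----
"""
roles = count_local_roles(sample)
print(roles["Primary"], roles["Secondary"])
```

Counting the xen12 lines above by hand gives 12 devices with local role Primary and 10 with local role Secondary, which adds up to the 22 reported by `wc -l`.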
root@xen12:~# lvdisplay -C --units g
LV VG Attr LSize Pool Origin Data% Move Log Copy% Convert
lv1 vg1 -wi-ao-- 46.56g
lv2 vg1 -wi-ao-- 3.72g
lv_drbd10_1 vg1 -wi-ao-- 100.00g
lv_drbd10_2 vg1 -wi-ao-- 2.00g
lv_drbd12_1 vg1 -wi-ao-- 100.00g
lv_drbd12_2 vg1 -wi-ao-- 2.00g
lv_drbd13_1 vg1 -wi-ao-- 100.00g
lv_drbd13_2 vg1 -wi-ao-- 2.00g
lv_drbd14_1 vg1 -wi-ao-- 100.00g
lv_drbd14_2 vg1 -wi-ao-- 2.00g
lv_drbd15_1 vg1 -wi-ao-- 100.00g
lv_drbd15_2 vg1 -wi-ao-- 2.00g
lv_drbd16_1 vg1 -wi-ao-- 100.00g
lv_drbd16_2 vg1 -wi-ao-- 2.00g
lv_drbd17_1 vg1 -wi-ao-- 50.00g
lv_drbd17_2 vg1 -wi-ao-- 2.00g
lv_drbd1_1 vg1 -wi-ao-- 150.00g
lv_drbd1_2 vg1 -wi-ao-- 2.00g
lv_drbd2_1 vg1 -wi-ao-- 100.00g
lv_drbd2_2 vg1 -wi-ao-- 2.00g
lv_drbd8_1 vg1 -wi-ao-- 100.00g
lv_drbd8_2 vg1 -wi-ao-- 2.00g
lv_drbd9_1 vg1 -wi-ao-- 100.00g
lv_drbd9_2 vg1 -wi-ao-- 2.00g
root@xen12:~# free
total used free shared buffers cached
Mem: 1599544 1490852 108692 0 133396 421688
-/+ buffers/cache: 935768 663776
Swap: 3903484 11068 3892416
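For reading the numbers above: the `-/+ buffers/cache` row is derived from the `Mem:` row by subtracting buffers and page cache from `used` and adding them to `free`. A small sketch of that arithmetic with the values above (in kB); the variable names are only illustrative.

```python
# Values in kB, copied from the `free` output above.
used, free = 1490852, 108692
buffers, cached = 133396, 421688

# "used" without buffers and page cache, as shown in the -/+ row.
used_without_cache = used - buffers - cached
# "free" plus buffers and page cache, as shown in the -/+ row.
free_with_cache = free + buffers + cached

print(used_without_cache, free_with_cache)  # 935768 663776
```

Not all of the cache is necessarily reclaimable under pressure, so the second number is an upper bound rather than guaranteed headroom.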
root@xen12:~# cat /proc/meminfo
MemTotal: 1599544 kB
MemFree: 90068 kB
Buffers: 133044 kB
Cached: 420476 kB
SwapCached: 336 kB
Active: 769764 kB
Inactive: 354888 kB
Active(anon): 477700 kB
Inactive(anon): 101496 kB
Active(file): 292064 kB
Inactive(file): 253392 kB
Unevictable: 72 kB
Mlocked: 72 kB
SwapTotal: 3903484 kB
SwapFree: 3892396 kB
Dirty: 36 kB
Writeback: 0 kB
AnonPages: 569652 kB
Mapped: 23380 kB
Shmem: 8064 kB
Slab: 208304 kB
SReclaimable: 57244 kB
SUnreclaim: 151060 kB
KernelStack: 4048 kB
PageTables: 7604 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 4703256 kB
Committed_AS: 626304 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 339164 kB
VmallocChunk: 34359395120 kB
HardwareCorrupted: 0 kB
AnonHugePages: 0 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
DirectMap4k: 66659472 kB
DirectMap2M: 0 kB
root@xen12:~# xm info
host : xen12
release : 3.2.0-4-amd64
version : #1 SMP Debian 3.2.60-1+deb7u3
machine : x86_64
nr_cpus : 24
nr_nodes : 2
cores_per_socket : 6
threads_per_core : 2
cpu_mhz : 2200
hw_caps : bfebfbff:2c100800:00000000:00007f40:73bee3ff:00000000:00000001:00000281
virt_caps : hvm hvm_directio
total_memory : 65490
free_memory : 46211
free_cpus : 0
xen_major : 4
xen_minor : 1
xen_extra : .4
xen_caps : xen-3.0-x86_64 xen-3.0-x86_32p hvm-3.0-x86_32 hvm-3.0-x86_32p hvm-3.0-x86_64
xen_scheduler : credit
xen_pagesize : 4096
platform_params : virt_start=0xffff800000000000
xen_changeset : unavailable
xen_commandline : placeholder dom0_mem=2048M,max:20148M
cc_compiler : gcc version 4.7.2 (Debian 4.7.2-5)
cc_compile_by : ultrotter
cc_compile_domain : debian.org
cc_compile_date : Sun Aug 17 10:54:25 EEST 2014
xend_config_format : 4
root@xen12:~# xm list
[names removed]
ID Mem VCPUs State Time(s)
0 2047 24 r----- 3570574.2
20 6144 2 -b---- 25020.3
21 2048 2 -b---- 12964.0
9 2048 2 -b---- 650630.6
19 1024 1 -b---- 2706.2
23 3072 2 -b---- 8947.8
22 2048 1 -b---- 4762.4
root@xen13:~# xm list
[names removed]
ID Mem VCPUs State Time(s)
0 2047 24 r----- 2191892.7
5 3072 2 -b---- 16419.9
4 5120 2 -b---- 79276.0
6 2048 1 -b---- 1826.5
3 2048 1 -b---- 282876.7
2 4096 2 -b---- 401536.1
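As a rough check of the failover case (all guests on one node), the guest memory from both `xm list` outputs can be added up and compared with `total_memory` from `xm info`. This is only a sketch under assumptions: xen13 is assumed to have the same `total_memory` as xen12, hypervisor overhead is ignored, and the three VMs still to be migrated are not included because their sizes are not shown here.

```python
# Guest memory in MB from the two `xm list` outputs above (Domain-0 excluded).
xen12_guests = [6144, 2048, 2048, 1024, 3072, 2048]
xen13_guests = [3072, 5120, 2048, 2048, 4096]
dom0_mb = 2047            # Mem column of Domain-0
total_memory_mb = 65490   # total_memory from `xm info` on xen12

all_guests_mb = sum(xen12_guests) + sum(xen13_guests)
needed_mb = all_guests_mb + dom0_mb          # everything running on one node
remaining_mb = total_memory_mb - needed_mb   # left for the VMs still to migrate

print(all_guests_mb, needed_mb, remaining_mb)  # 32768 34815 30675
```

So guest memory itself leaves room on a single node; the open question is whether the 2 GB Dom0 can handle the additional DRBD devices.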
Still to migrate from old cluster:
root@xen10:~# cat /proc/drbd | grep Primary
7: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r----
8: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r----
9: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r----
10: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r----
21: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r----
22: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r----
TIA
Helmut Wollmersdorfer