[DRBD-user] Testing new DRBD9 dedicated repo for PVE

Michele Rossetti  rossetti@sardi.it
Fri Jan 13 19:58:06 CET 2017


After the upgrade (dist-upgrade to PVE 4.4 and installation of 
drbdmanage-proxmox), the KVM VMs no longer start at boot; they start 
only if they are on the primary node, and once started they don't 
migrate from the primary node to the secondary, even in HA.

The error messages from PVE are:

kvm: -drive 
file=/dev/drbd/by-res/vm-104-disk-1/0,if=none,id=drive-scsi0,format=raw,cache=none,aio=native,detect-zeroes=on: 
Could not open '/dev/drbd/by-res/vm-104-disk-1/0': No such file or 
directory
TASK ERROR: start failed: command '/usr/bin/kvm -id 104 -chardev 
'socket,id=qmp,path=/var/run/qemu-server/104.qmp,server,nowait' -mon 
'chardev=qmp,mode=control' -pidfile /var/run/qemu-server/104.pid 
-daemonize -smbios 'type=1,uuid=72d4bc28-b877-413a-9750-e7bf97938abb' 
-name php4i386 -smp '1,sockets=1,cores=1,maxcpus=1' -nodefaults -boot 
'menu=on,strict=on,reboot-timeout=1000,splash=/usr/share/qemu-server/bootsplash.jpg' 
-vga cirrus -vnc unix:/var/run/qemu-server/104.vnc,x509,password -cpu 
kvm64,+lahf_lm,+sep,+kvm_pv_unhalt,+kvm_pv_eoi,enforce -m 2048 -k it 
-device 'pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f' 
-device 'pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e' 
-device 'piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2' -device 
'usb-tablet,id=tablet,bus=uhci.0,port=1' -device 
'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3' -iscsi 
'initiator-name=iqn.1993-08.org.debian:01:af80fcb2976' -drive 
'if=none,id=drive-ide2,media=cdrom,aio=threads' -device 
'ide-cd,bus=ide.1,unit=0,drive=drive-ide2,id=ide2,bootindex=200' 
-device 'virtio-scsi-pci,id=scsihw0,bus=pci.0,addr=0x5' -drive 
'file=/dev/drbd/by-res/vm-104-disk-1/0,if=none,id=drive-scsi0,format=raw,cache=none,aio=native,detect-zeroes=on' 
-device 
'scsi-hd,bus=scsihw0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0,id=scsi0,bootindex=100' 
-netdev 
'type=tap,id=net0,ifname=tap104i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown,vhost=on' 
-device 
'virtio-net-pci,mac=6E:23:50:8A:35:50,netdev=net0,bus=pci.0,addr=0x12,id=net0,bootindex=300' 
-machine 'type=pc-i440fx-2.7' -incoming 
unix:/run/qemu-server/104.migrate -S' failed: exit code 1

and this is what happens when trying to migrate a running VM in HA:

task started by HA resource agent
Jan 13 19:13:47 starting migration of VM 104 to node 'mpve1' (82.xx.xx.xx)
Jan 13 19:13:47 copying disk images
Jan 13 19:13:47 starting VM 104 on remote node 'mpve1'
Jan 13 19:13:50 start failed: command '/usr/bin/kvm -id 104 -chardev 
'socket,id=qmp,path=/var/run/qemu-server/104.qmp,server,nowait' -mon 
'chardev=qmp,mode=control' -pidfile /var/run/qemu-server/104.pid 
-daemonize -smbios 'type=1,uuid=72d4bc28-b877-413a-9750-e7bf97938abb' 
-name php4i386 -smp '1,sockets=1,cores=1,maxcpus=1' -nodefaults -boot 
'menu=on,strict=on,reboot-timeout=1000,splash=/usr/share/qemu-server/bootsplash.jpg' 
-vga cirrus -vnc unix:/var/run/qemu-server/104.vnc,x509,password -cpu 
kvm64,+lahf_lm,+sep,+kvm_pv_unhalt,+kvm_pv_eoi,enforce -m 2048 -k it 
-device 'pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f' 
-device 'pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e' 
-device 'piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2' -device 
'usb-tablet,id=tablet,bus=uhci.0,port=1' -device 
'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3' -iscsi 
'initiator-name=iqn.1993-08.org.debian:01:af80fcb2976' -drive 
'if=none,id=drive-ide2,media=cdrom,aio=threads' -device 
'ide-cd,bus=ide.1,unit=0,drive=drive-ide2,id=ide2,bootindex=200' 
-device 'virtio-scsi-pci,id=scsihw0,bus=pci.0,addr=0x5' -drive 
'file=/dev/drbd/by-res/vm-104-disk-1/0,if=none,id=drive-scsi0,format=raw,cache=none,aio=native,detect-zeroes=on' 
-device 
'scsi-hd,bus=scsihw0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0,id=scsi0,bootindex=100' 
-netdev 
'type=tap,id=net0,ifname=tap104i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown,vhost=on' 
-device 
'virtio-net-pci,mac=6E:23:50:8A:35:50,netdev=net0,bus=pci.0,addr=0x12,id=net0,bootindex=300' 
-machine 'type=pc-i440fx-2.7' -incoming 
unix:/run/qemu-server/104.migrate -S' failed: exit code 1
Jan 13 19:13:50 ERROR: online migrate failure - command '/usr/bin/ssh 
-o 'BatchMode=yes' root@82.xx.xx.xx qm start 104 --skiplock 
--migratedfrom mpve3 --migration_type secure --stateuri unix 
--machine pc-i440fx-2.7' failed: exit code 255
Jan 13 19:13:50 aborting phase 2 - cleanup resources
Jan 13 19:13:50 migrate_cancel
Jan 13 19:13:51 ERROR: migration finished with problems (duration 00:00:04)
TASK ERROR: migration problems

And indeed, /dev/drbd/by-res/vm-104-disk-1/0 does not exist: No such file or directory.

Trying to locate vm-104-disk, this is the output:

root@mpve1:/dev/drbd/by-disk/drbdpool# locate vm-104-disk
/dev/drbdpool/vm-104-disk-1_00
/var/lib/drbd.d/drbdmanage_vm-104-disk-1.res.q
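
So the backend volume is there under /dev/drbdpool, but the DRBD device 
and its /dev/drbd/by-res symlink are not. A few commands that may help 
narrow this down (assuming the resource name vm-104-disk-1; drbdadm and 
udevadm are the standard tools here, but the right recovery step may 
differ in your setup):

drbdadm status vm-104-disk-1   # is the resource known and up on this node?
drbdadm up vm-104-disk-1       # if it is down, bringing it up should recreate the device and the by-res symlink
udevadm trigger --subsystem-match=block   # re-run udev rules if the device exists but the symlink is missing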

Checking DRBD, everything seems OK:

root@mpve1:~# drbd-overview
  0:.drbdctrl/0  Connected(3*) Seco(mpve2,mpve1)/Prim(mpve3) UpTo(mpve1)/UpTo(mpve3,mpve2)
  1:.drbdctrl/1  Connected(3*) Seco(mpve2,mpve1)/Prim(mpve3) UpTo(mpve1)/UpTo(mpve3,mpve2)

root@mpve1:~# drbdsetup status
.drbdctrl role:Secondary
   volume:0 disk:UpToDate
   volume:1 disk:UpToDate
   mpve2 role:Secondary
     volume:0 peer-disk:UpToDate
     volume:1 peer-disk:UpToDate
   mpve3 role:Primary
     volume:0 peer-disk:UpToDate
     volume:1 peer-disk:UpToDate


root@mpve1:~# drbdmanage list-nodes
+---------------------------------------+
| Name  | Pool Size | Pool Free | State |
|---------------------------------------|
| mpve1 |   7612664 |   7297499 |    ok |
| mpve2 |   7612664 |   7252584 |    ok |
| mpve3 |   7612664 |   7252584 |    ok |
+---------------------------------------+

root@mpve1:~# drbdmanage list-assignments
+----------------------------------------+
| Node  | Resource      | Vol ID | State |
|----------------------------------------|
| mpve1 | vm-105-disk-1 |      * |    ok |
| mpve1 | vm-104-disk-1 |      * |    ok |
| mpve1 | vm-103-disk-1 |      * |    ok |
| mpve1 | vm-102-disk-1 |      * |    ok |
| mpve1 | vm-101-disk-1 |      * |    ok |
| mpve1 | vm-100-disk-1 |      * |    ok |
| mpve2 | vm-105-disk-1 |      * |    ok |
| mpve2 | vm-104-disk-1 |      * |    ok |
| mpve2 | vm-103-disk-1 |      * |    ok |
| mpve2 | vm-102-disk-1 |      * |    ok |
| mpve2 | vm-101-disk-1 |      * |    ok |
| mpve2 | vm-100-disk-1 |      * |    ok |
| mpve3 | vm-105-disk-1 |      * |    ok |
| mpve3 | vm-104-disk-1 |      * |    ok |
| mpve3 | vm-103-disk-1 |      * |    ok |
| mpve3 | vm-102-disk-1 |      * |    ok |
| mpve3 | vm-101-disk-1 |      * |    ok |
| mpve3 | vm-100-disk-1 |      * |    ok |
+----------------------------------------+

Any help or suggestions?
Thanks,

Michele


On 10/01/2017 10:14, Roberto Resoli wrote:
>  On 09/01/2017 19:20, Michele Rossetti wrote:
>>  Does this mean that in a PVE cluster of 3 servers with updated DRBD9
>>  it isn't possible to restore KVM virtual machines?
>>  Are other people on the list seeing the same problem, or is it only
>>  in your configuration?
>>  Just to know before updating ;-)
>  I have just retried today, after having upgraded drbd-utils to
>  8.9.10+linbit-1, which appeared yesterday.
>
>  I have successfully cycled through
>
>  vm creation -> vm dump -> vm restore
>
>  on a drbd9 (lvm-thin based) storage, with only some quirks that I
>  will describe here soon.

I think the quirks were entirely related to the creation/deletion of the
LVM volumes (the backends of the DRBD ones).

In one case, a restore operation resulted in the correct creation of the
new DRBD resource, but on one node the assignment stayed pending, and
"drbdmanage resume-all" didn't fix it.

I resolved it with a "drbdadm down <vm-resource>" on the problematic
node, removing (lvremove) the backend LVM volume, and reissuing a
"drbdmanage resume-all", which recreated it correctly.
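
In shell terms, the sequence was roughly the following (using
vm-104-disk-1 as an example resource name; the backend LVM volume
follows the <resource>_<volnr> naming seen under /dev/drbdpool, so
adjust to the affected resource):

drbdadm down vm-104-disk-1           # on the node with the stuck assignment
lvremove drbdpool/vm-104-disk-1_00   # drop the leftover backend LVM volume
drbdmanage resume-all                # lets drbdmanage recreate it and finish the assignment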

Now I can delete the test VM and restore it without any problem.

So, my advice in case of problems creating/deleting/restoring VMs is to
check that the creation/deletion of the backend LVM volumes is performed
as expected.
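
For example (assuming the volume group is named drbdpool, as in the
listings above), comparing these two views should reveal a backend
volume that is missing or left over:

lvs drbdpool               # backend LVM volumes, as LVM sees them
drbdmanage list-volumes    # volumes as drbdmanage expects them to exist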

rob
-- 
"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
MICRO srl
Informatica e Telecomunicazioni - Web services - Web sites
      
Michele Rossetti

sede legale: via Raffa Garzia 7   09126 Cagliari (Italy)
sede operativa: viale Marconi 222  09131 Cagliari
Ph. +39 070 400240  Fax +39 070 4526207

MKM-REG
Web:  http://www.microsrl.com     http://www.sardi.it
E-mail: microsrl@microsrl.com
"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""


