[DRBD-user] drbdadm verify stalled

Matthieu Lejeune matthieu.lejeune at exxoss.com
Tue Dec 3 09:39:58 CET 2013



Hello

I have a problem when I launch a drbdadm verify on my resources.

This is my config:

root@ifprdstor6b:~# cat /etc/drbd.d/DSA601.res
resource DSA601 {
   protocol C;

   startup {
     wfc-timeout 0;
   }

   disk {
     on-io-error detach;
   }

   syncer {
     rate 400M;
     verify-alg md5;
   }

   on ifprdstor6a {
     device    /dev/drbd1;
     disk      /dev/sda;
     address   10.13.1.1:7788;
     meta-disk internal;
   }

   on ifprdstor6b {
     device    /dev/drbd1;
     disk      /dev/sda;
     address   10.13.1.2:7788;
     meta-disk internal;
   }
}

I have 8 resources with the same configuration.

I'm using DRBD over an InfiniBand (IB) network, and this is my network config:

root@ifprdstor6b:~# cat /etc/network/interfaces
# This file describes the network interfaces available on your system
# and how to activate them. For more information, see interfaces(5).

# The loopback network interface
auto lo
iface lo inet loopback

# The primary network interface
allow-hotplug eth0
iface eth0 inet static
     address 10.12.20.2
     netmask 255.255.255.0
     network 10.12.20.0
     broadcast 10.1.2.255
     gateway 10.12.20.254
     # dns-* options are implemented by the resolvconf package, if installed
     dns-nameservers 10.1.2.33 10.1.2.34
     dns-search lampiris.local

allow-hotplug ib0
iface ib0 inet static
   address 10.13.1.2
   netmask 255.255.255.0


When I start a verify on the first resource with "drbdadm verify DSA601",
it works and the verify completes.

But when I start a verify on the second resource with "drbdadm verify
DSA602", it blocks.

This is /proc/drbd at that point:

root@ifprdstor6a:~# cat /proc/drbd
version: 8.3.11 (api:88/proto:86-96)
srcversion: F937DCB2E5D83C6CCE4A6C9

  1: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
     ns:0 nr:0 dw:0 dr:88 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
  2: cs:VerifyS ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
     ns:0 nr:0 dw:0 dr:82296 al:0 bm:0 lo:1 pe:8706 ua:2048 ap:0 ep:1 wo:f oos:0
     [>....................] verified:  0.1% (1713280/1713352)Mfinish: 386277:34:31 speed: 0 (0) want: 409,600 K/sec (stalled)
  3: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
     ns:0 nr:0 dw:0 dr:88 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
  4: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
     ns:0 nr:0 dw:0 dr:88 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
  5: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
     ns:0 nr:0 dw:0 dr:88 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
  6: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
     ns:0 nr:0 dw:0 dr:88 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
  7: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
     ns:0 nr:0 dw:0 dr:88 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
  8: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
     ns:0 nr:0 dw:0 dr:88 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
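
With 8 resources it can be easier to spot the stuck one programmatically. The following is only a minimal sketch in Python, not part of the original setup: the function names `parse_proc_drbd` and `stalled_devices` are hypothetical, the sample text is copied from the output above (first two devices only), and the parser assumes the DRBD 8.3 /proc/drbd layout shown here. It collects the `lo`, `pe` and `ua` counters (local, pending and unacknowledged requests) and flags any device whose progress line contains "(stalled)".

```python
import re

# Sample copied from the /proc/drbd output in this post (devices 1 and 2 only).
SAMPLE = """\
version: 8.3.11 (api:88/proto:86-96)
srcversion: F937DCB2E5D83C6CCE4A6C9

 1: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
    ns:0 nr:0 dw:0 dr:88 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
 2: cs:VerifyS ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
    ns:0 nr:0 dw:0 dr:82296 al:0 bm:0 lo:1 pe:8706 ua:2048 ap:0 ep:1 wo:f oos:0
    [>....................] verified:  0.1% (1713280/1713352)Mfinish: 386277:34:31 speed: 0 (0) want: 409,600 K/sec (stalled)
"""

# A device block starts with "<minor>: cs:<connection state>".
DEVICE_RE = re.compile(r"^\s*(\d+): cs:(\S+)")
# Counters of interest on the following line(s).
COUNTER_RE = re.compile(r"\b(lo|pe|ua):(\d+)")


def parse_proc_drbd(text):
    """Return {minor: {"cs": ..., "lo": ..., "pe": ..., "ua": ..., "stalled": bool}}."""
    devices = {}
    current = None
    for line in text.splitlines():
        match = DEVICE_RE.match(line)
        if match:
            current = {"cs": match.group(2), "stalled": False}
            devices[int(match.group(1))] = current
            continue
        if current is None:
            continue  # header lines before the first device
        for name, value in COUNTER_RE.findall(line):
            current[name] = int(value)
        if "(stalled)" in line:
            current["stalled"] = True
    return devices


def stalled_devices(devices):
    """Return the sorted minor numbers whose progress line says (stalled)."""
    return sorted(minor for minor, dev in devices.items() if dev["stalled"])


if __name__ == "__main__":
    parsed = parse_proc_drbd(SAMPLE)
    for minor in stalled_devices(parsed):
        dev = parsed[minor]
        print(minor, dev["cs"], "pe:", dev["pe"], "ua:", dev["ua"], "lo:", dev["lo"])
```

On a live node the same function could be fed the contents of /proc/drbd instead of `SAMPLE`. In the output above, only device 2 has non-zero `pe` and `ua` counters, which matches the stalled verify.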


Resource 2 is frozen: I can't access the volume to create an LVM LV or do
anything else with it. But I can still use the other resources.

I have these entries in the syslog:

Dec  3 09:39:23 ifprdstor6a kernel: [61078.583849] block drbd2: [drbd2_worker/3080] sock_sendmsg time expired, ko = 4294957143
Dec  3 09:39:29 ifprdstor6a kernel: [61084.574556] block drbd2: [drbd2_worker/3080] sock_sendmsg time expired, ko = 4294957142
Dec  3 09:39:31 ifprdstor6a OpenSM[2668]: SM port is down#012
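
The "ko" values are just below 2^32 and drop by one between the two entries, which are 6 seconds apart. One possible reading, which is an assumption and not something the log itself states, is that this is a counter that went below zero and is printed as an unsigned 32-bit number. A small Python sketch of that arithmetic (the helper name `as_signed32` is hypothetical):

```python
def as_signed32(value):
    """Reinterpret an unsigned 32-bit integer as two's-complement signed."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


# The two "ko" values from the syslog entries above.
ko_values = [4294957143, 4294957142]
signed = [as_signed32(v) for v in ko_values]

if __name__ == "__main__":
    print(signed)  # [-10153, -10154]
```

If that reading is right, the send timeout on drbd2 had already expired roughly ten thousand times by the time of these entries, but this should be checked against the DRBD 8.3 source before drawing conclusions.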

Has anyone had this problem? Any solutions?

Thanks in advance

Matthieu Lejeune









