[DRBD-user] secondary fails to connect with primary

Tom Brown brown at esteem.com
Tue Jan 16 17:43:27 CET 2007


Hi,

I have drbd and heartbeat installed on two nodes: foo and bar. When I bring up
the primary node (foo), drbd starts, then heartbeat starts, foo goes into
primary mode, and heartbeat mounts the filesystems.

foo:~# cat /proc/drbd
version: 8.0rc1 (api:86/proto:85)
SVN Revision: 2644 build by tbrown at foo, 2007-01-11 09:03:54
 0: cs:StandAlone st:Primary/Unknown ds:UpToDate/DUnknown   r---
    ns:0 nr:0 dw:104 dr:273 al:0 bm:0 lo:0 pe:0 ua:0 ap:0
        resync: used:0/31 hits:0 misses:0 starving:0 dirty:0 changed:0
        act_log: used:0/257 hits:26 misses:0 starving:0 dirty:0 changed:0
 1: cs:StandAlone st:Primary/Unknown ds:UpToDate/DUnknown   r---
    ns:0 nr:0 dw:40 dr:245 al:0 bm:0 lo:0 pe:0 ua:0 ap:0
        resync: used:0/31 hits:0 misses:0 starving:0 dirty:0 changed:0
        act_log: used:0/257 hits:10 misses:0 starving:0 dirty:0 changed:0
 2: cs:StandAlone st:Primary/Unknown ds:UpToDate/DUnknown   r---
    ns:0 nr:0 dw:40 dr:309 al:0 bm:0 lo:0 pe:0 ua:0 ap:0
        resync: used:0/31 hits:0 misses:0 starving:0 dirty:0 changed:0
        act_log: used:0/257 hits:10 misses:0 starving:0 dirty:0 changed:0


However, when I bring up the secondary node (bar), drbd will not connect to
foo. Below are the output of starting drbd on bar, the output of
'cat /proc/drbd' on bar, and the drbd.conf. I'm guessing my configuration
is incorrect. Any ideas where it is wrong?

Thanks,
Tom

PS: I added part of the /var/log/syslog from bar at the end.

bar:~# /etc/init.d/drbd start
Starting DRBD resources:    [ d0 d1 d2 s0 s1 s2 n0 n1 n2 ].
..........
***************************************************************
 DRBD's startup script waits for the peer node(s) to appear.
 - In case this node was already a degraded cluster before the
   reboot the timeout is 120 seconds. [degr-wfc-timeout]
 - If the peer was available before the reboot the timeout will
   expire after 30 seconds. [wfc-timeout]
   (These values are for resource 'r0'; 0 sec -> wait forever)
 To abort waiting enter 'yes' [  30]:
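
The timeout rule described in the startup message can be written down as a small sketch. The function name and arguments are illustrative only, not taken from the DRBD init script; the default values are the ones from resource 'r0'.

```python
def startup_wait_seconds(was_degraded, wfc_timeout=30, degr_wfc_timeout=120):
    """Return how long to wait for the peer, or None for 'wait forever'.

    Models only the rule printed by the startup script: a cluster that was
    already degraded before the reboot uses degr-wfc-timeout, otherwise
    wfc-timeout applies, and 0 seconds means no timeout at all.
    """
    timeout = degr_wfc_timeout if was_degraded else wfc_timeout
    return None if timeout == 0 else timeout


print(startup_wait_seconds(was_degraded=False))
print(startup_wait_seconds(was_degraded=True))
```

With the values above, the prompt's countdown of 30 corresponds to the non-degraded case.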


bar:~# cat /proc/drbd
version: 8.0rc1 (api:86/proto:85)
SVN Revision: 2644 build by tbrown at bar, 2007-01-11 09:04:23
 0: cs:WFConnection st:Secondary/Unknown ds:UpToDate/DUnknown C r---
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0
        resync: used:0/31 hits:0 misses:0 starving:0 dirty:0 changed:0
        act_log: used:0/257 hits:0 misses:0 starving:0 dirty:0 changed:0
 1: cs:WFConnection st:Secondary/Unknown ds:UpToDate/DUnknown C r---
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0
        resync: used:0/31 hits:0 misses:0 starving:0 dirty:0 changed:0
        act_log: used:0/257 hits:0 misses:0 starving:0 dirty:0 changed:0
 2: cs:WFConnection st:Secondary/Unknown ds:UpToDate/DUnknown C r---
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0
        resync: used:0/31 hits:0 misses:0 starving:0 dirty:0 changed:0
        act_log: used:0/257 hits:0 misses:0 starving:0 dirty:0 changed:0
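
To compare the two /proc/drbd listings side by side, here is a minimal sketch that pulls the cs, st and ds fields out of the status lines. The helper name and the regular expression are my own and only cover the 8.0rc1 output format shown in this message.

```python
import re

# Status lines copied from the two /proc/drbd listings in this message.
FOO = """\
 0: cs:StandAlone st:Primary/Unknown ds:UpToDate/DUnknown   r---
 1: cs:StandAlone st:Primary/Unknown ds:UpToDate/DUnknown   r---
 2: cs:StandAlone st:Primary/Unknown ds:UpToDate/DUnknown   r---
"""
BAR = """\
 0: cs:WFConnection st:Secondary/Unknown ds:UpToDate/DUnknown C r---
 1: cs:WFConnection st:Secondary/Unknown ds:UpToDate/DUnknown C r---
 2: cs:WFConnection st:Secondary/Unknown ds:UpToDate/DUnknown C r---
"""

STATUS = re.compile(r"^\s*(\d+):\s+cs:(\S+)\s+st:(\S+)\s+ds:(\S+)", re.MULTILINE)


def connection_states(text):
    """Map device minor number -> (cs, st, ds) from /proc/drbd text."""
    return {
        int(m.group(1)): (m.group(2), m.group(3), m.group(4))
        for m in STATUS.finditer(text)
    }


for host, text in (("foo", FOO), ("bar", BAR)):
    for minor, (cs, st, ds) in sorted(connection_states(text).items()):
        print(f"{host} drbd{minor}: cs={cs} st={st} ds={ds}")
```

The cs field is the part that differs: all three devices report StandAlone on foo and WFConnection on bar.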


bar:~# cat /etc/drbd.conf
global {
    usage-count yes;
}

common {
  syncer { rate 68M; }
}

resource r0 {
  protocol C;
  handlers {
    pri-on-incon-degr "echo O > /proc/sysrq-trigger ; halt -f";
    pri-lost-after-sb "echo O > /proc/sysrq-trigger ; halt -f";
    local-io-error "echo O > /proc/sysrq-trigger ; halt -f";
    outdate-peer "/usr/sbin/drbd-peer-outdater";
  }
  startup {
    wfc-timeout  30;
    degr-wfc-timeout 120;    # 2 minutes.
  }
  disk {
    on-io-error   detach;
  }
  net {
    after-sb-0pri disconnect;
    after-sb-1pri disconnect;
    after-sb-2pri disconnect;
    rr-conflict disconnect;
  }
  syncer {
    al-extents 257;
  }
  on foo {
    device     /dev/drbd0;
    disk       /dev/hdb5;
    address    192.168.1.14:7788;
    meta-disk  /dev/hdb1 [0];
  }
  on bar {
    device    /dev/drbd0;
    disk      /dev/sda8;
    address   192.168.1.15:7788;
    meta-disk /dev/sda5 [0];
  }
}

resource r1 {
  protocol C;
  handlers {
    pri-on-incon-degr "echo O > /proc/sysrq-trigger ; halt -f";
    pri-lost-after-sb "echo O > /proc/sysrq-trigger ; halt -f";
    local-io-error "echo O > /proc/sysrq-trigger ; halt -f";
    outdate-peer "/usr/sbin/drbd-peer-outdater";
  }
  startup {
    wfc-timeout  30;
    degr-wfc-timeout 120;    # 2 minutes.
  }
  disk {
    on-io-error   detach;
  }
  net {
    after-sb-0pri disconnect;
    after-sb-1pri disconnect;
    after-sb-2pri disconnect;
    rr-conflict disconnect;
  }
  syncer {
    after "r0";
    al-extents 257;
  }
  on foo {
    device     /dev/drbd1;
    disk       /dev/hdb6;
    address    192.168.1.14:7789;
    meta-disk  /dev/hdb2 [0];
  }
  on bar {
    device    /dev/drbd1;
    disk      /dev/sda9;
    address   192.168.1.15:7789;
    meta-disk /dev/sda6 [0];
  }
}

resource r2 {
  protocol C;
  handlers {
    pri-on-incon-degr "echo O > /proc/sysrq-trigger ; halt -f";
    pri-lost-after-sb "echo O > /proc/sysrq-trigger ; halt -f";
    local-io-error "echo O > /proc/sysrq-trigger ; halt -f";
    outdate-peer "/usr/sbin/drbd-peer-outdater";
  }
  startup {
    wfc-timeout  30;
    degr-wfc-timeout 120;    # 2 minutes.
  }
  disk {
    on-io-error   detach;
  }
  net {
    after-sb-0pri disconnect;
    after-sb-1pri disconnect;
    after-sb-2pri disconnect;
    rr-conflict disconnect;
  }
  syncer {
    after "r1";
    al-extents 257;
  }
  on foo {
    device     /dev/drbd2;
    disk       /dev/hdb7;
    address    192.168.1.14:7790;
    meta-disk  /dev/hdb3 [0];
  }
  on bar {
    device    /dev/drbd2;
    disk      /dev/sda10;
    address   192.168.1.15:7790;
    meta-disk /dev/sda7 [0];
  }
}
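
As a sanity check on the address lines, the following sketch scans the per-host sections and reports any resource that lacks two 'on' sections or reuses an IP:port endpoint already taken by another resource. It is a line-by-line scan with hypothetical helper names, not a real drbd.conf parser, and the sample contains only the address lines from the file above.

```python
import re

# Per-host address lines copied from the drbd.conf above; all other
# settings are left out to keep the sample short.
CONF = """
resource r0 {
  on foo {
    address    192.168.1.14:7788;
  }
  on bar {
    address   192.168.1.15:7788;
  }
}
resource r1 {
  on foo {
    address    192.168.1.14:7789;
  }
  on bar {
    address   192.168.1.15:7789;
  }
}
resource r2 {
  on foo {
    address    192.168.1.14:7790;
  }
  on bar {
    address   192.168.1.15:7790;
  }
}
"""


def endpoints(conf):
    """Return {resource: {host: (ip, port)}} using a line-by-line scan."""
    result = {}
    resource = host = None
    for line in conf.splitlines():
        m = re.search(r"\bresource\s+(\S+)", line)
        if m:
            resource = m.group(1)
        m = re.search(r"\bon\s+(\S+)\s*\{", line)
        if m:
            host = m.group(1)
        m = re.search(r"\baddress\s+([\d.]+):(\d+)\s*;", line)
        if m and resource and host:
            result.setdefault(resource, {})[host] = (m.group(1), int(m.group(2)))
    return result


def check(eps):
    """List resources without two hosts and endpoints reused across resources."""
    problems = []
    owner = {}  # (ip, port) -> resource that first used it
    for resource, hosts in eps.items():
        if len(hosts) != 2:
            problems.append(f"{resource}: expected 2 'on' sections, found {len(hosts)}")
        for host, endpoint in hosts.items():
            if endpoint in owner:
                problems.append(f"{resource}: {host} reuses {endpoint} from {owner[endpoint]}")
            else:
                owner[endpoint] = resource
    return problems


problems = check(endpoints(CONF))
print(problems if problems else "no address problems found")
```

This check finds nothing wrong with the addresses and ports in this file, so it does not by itself explain the failed connection.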

Relevant part of /var/log/syslog:
Jan 15 15:11:34 bar kernel: drbd: initialised. Version: 8.0rc1 (api:86/proto:85)
Jan 15 15:11:34 bar kernel: drbd: SVN Revision: 2644 build by tbrown at bar, 2007-01-11 09:04:23
Jan 15 15:11:34 bar kernel: drbd: registered as block device major 147
Jan 15 15:11:34 bar kernel: drbd: minor_table @ 0xde694c60
Jan 15 15:11:34 bar kernel: drbd0: disk( Diskless -> Attaching )
Jan 15 15:11:34 bar kernel: klogd 1.4.1, ---------- state change ----------
Jan 15 15:11:34 bar kernel: drbd0: No usable activity log found.
Jan 15 15:11:34 bar kernel: drbd0: max_segment_size ( = BIO size ) = 32768
Jan 15 15:11:34 bar kernel: drbd0: drbd_bm_resize called with capacity == 14681457
Jan 15 15:11:34 bar kernel: drbd0: resync bitmap: bits=1835183 words=57350
Jan 15 15:11:34 bar kernel: drbd0: size = 7168 MB (7340728 KB)
Jan 15 15:11:34 bar kernel: drbd0: reading of bitmap took 6 jiffies
Jan 15 15:11:34 bar kernel: drbd0: recounting of set bits took additional 0 jiffies
Jan 15 15:11:34 bar kernel: drbd0: 0 KB marked out-of-sync by on disk bit-map.
Jan 15 15:11:34 bar kernel: drbd0: disk( Attaching -> UpToDate )
Jan 15 15:11:34 bar kernel: drbd0: Writing meta data super block now.
Jan 15 15:11:34 bar kernel: drbd1: disk( Diskless -> Attaching )
Jan 15 15:11:34 bar kernel: drbd1: No usable activity log found.
Jan 15 15:11:34 bar kernel: drbd1: max_segment_size ( = BIO size ) = 32768
Jan 15 15:11:34 bar kernel: drbd1: drbd_bm_resize called with capacity == 14680449
Jan 15 15:11:34 bar kernel: drbd1: resync bitmap: bits=1835057 words=57346
Jan 15 15:11:34 bar kernel: drbd1: size = 7168 MB (7340224 KB)
Jan 15 15:11:34 bar kernel: drbd1: reading of bitmap took 4 jiffies
Jan 15 15:11:34 bar kernel: drbd1: recounting of set bits took additional 1 jiffies
Jan 15 15:11:34 bar kernel: drbd1: 0 KB marked out-of-sync by on disk bit-map.
Jan 15 15:11:34 bar kernel: drbd1: disk( Attaching -> UpToDate )
Jan 15 15:11:34 bar kernel: drbd1: Writing meta data super block now.
Jan 15 15:11:34 bar kernel: drbd2: disk( Diskless -> Attaching )
Jan 15 15:11:34 bar kernel: drbd2: No usable activity log found.
Jan 15 15:11:34 bar kernel: drbd2: max_segment_size ( = BIO size ) = 32768
Jan 15 15:11:34 bar kernel: drbd2: drbd_bm_resize called with capacity == 18874737
Jan 15 15:11:34 bar kernel: drbd2: resync bitmap: bits=2359343 words=73730
Jan 15 15:11:34 bar kernel: drbd2: size = 9216 MB (9437368 KB)
Jan 15 15:11:34 bar kernel: drbd2: reading of bitmap took 2 jiffies
Jan 15 15:11:34 bar kernel: drbd2: recounting of set bits took additional 0 jiffies
Jan 15 15:11:34 bar kernel: drbd2: 0 KB marked out-of-sync by on disk bit-map.
Jan 15 15:11:34 bar kernel: drbd2: disk( Attaching -> UpToDate )
Jan 15 15:11:34 bar kernel: drbd2: Writing meta data super block now.
Jan 15 15:11:34 bar kernel: drbd0: conn( StandAlone -> Unconnected )
Jan 15 15:11:34 bar kernel: drbd0: receiver (re)started
Jan 15 15:11:34 bar kernel: drbd0: conn( Unconnected -> WFConnection )
Jan 15 15:11:34 bar kernel: drbd1: conn( StandAlone -> Unconnected )
Jan 15 15:11:34 bar kernel: drbd1: receiver (re)started
Jan 15 15:11:34 bar kernel: drbd1: conn( Unconnected -> WFConnection )
Jan 15 15:11:34 bar kernel: drbd2: conn( StandAlone -> Unconnected )
Jan 15 15:11:34 bar kernel: drbd2: receiver (re)started
Jan 15 15:11:34 bar kernel: drbd2: conn( Unconnected -> WFConnection )
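
The connection state changes in this excerpt can be summarized with a short sketch that keeps the last conn( old -> new ) transition seen for each device. The function name and pattern are my own and only match the kernel message format shown here.

```python
import re

# The conn(...) and receiver lines copied from the end of the syslog excerpt.
LOG = """\
Jan 15 15:11:34 bar kernel: drbd0: conn( StandAlone -> Unconnected )
Jan 15 15:11:34 bar kernel: drbd0: receiver (re)started
Jan 15 15:11:34 bar kernel: drbd0: conn( Unconnected -> WFConnection )
Jan 15 15:11:34 bar kernel: drbd1: conn( StandAlone -> Unconnected )
Jan 15 15:11:34 bar kernel: drbd1: receiver (re)started
Jan 15 15:11:34 bar kernel: drbd1: conn( Unconnected -> WFConnection )
Jan 15 15:11:34 bar kernel: drbd2: conn( StandAlone -> Unconnected )
Jan 15 15:11:34 bar kernel: drbd2: receiver (re)started
Jan 15 15:11:34 bar kernel: drbd2: conn( Unconnected -> WFConnection )
"""

TRANSITION = re.compile(r"kernel: (drbd\d+): conn\( (\w+) -> (\w+) \)")


def final_conn_states(log):
    """Return {device: last connection state reached} from kernel log text."""
    final = {}
    for device, _old, new in TRANSITION.findall(log):
        final[device] = new  # later transitions overwrite earlier ones
    return final


for device, state in sorted(final_conn_states(LOG).items()):
    print(f"{device}: {state}")
```

All three devices on bar end in WFConnection, which matches the 'cat /proc/drbd' output from bar.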




