Hi,
I have drbd and heartbeat installed on two nodes, foo and bar. When I bring up
the primary node (foo), drbd starts, then heartbeat starts, foo goes into
primary mode, and heartbeat mounts the filesystems.
foo:~# cat /proc/drbd
version: 8.0rc1 (api:86/proto:85)
SVN Revision: 2644 build by tbrown at foo, 2007-01-11 09:03:54
0: cs:StandAlone st:Primary/Unknown ds:UpToDate/DUnknown r---
ns:0 nr:0 dw:104 dr:273 al:0 bm:0 lo:0 pe:0 ua:0 ap:0
resync: used:0/31 hits:0 misses:0 starving:0 dirty:0 changed:0
act_log: used:0/257 hits:26 misses:0 starving:0 dirty:0 changed:0
1: cs:StandAlone st:Primary/Unknown ds:UpToDate/DUnknown r---
ns:0 nr:0 dw:40 dr:245 al:0 bm:0 lo:0 pe:0 ua:0 ap:0
resync: used:0/31 hits:0 misses:0 starving:0 dirty:0 changed:0
act_log: used:0/257 hits:10 misses:0 starving:0 dirty:0 changed:0
2: cs:StandAlone st:Primary/Unknown ds:UpToDate/DUnknown r---
ns:0 nr:0 dw:40 dr:309 al:0 bm:0 lo:0 pe:0 ua:0 ap:0
resync: used:0/31 hits:0 misses:0 starving:0 dirty:0 changed:0
act_log: used:0/257 hits:10 misses:0 starving:0 dirty:0 changed:0
However, when I bring up the secondary node (bar), drbd will not connect to
foo. Below is the output of starting drbd on bar, followed by the output of
'cat /proc/drbd' and the drbd.conf. I'm guessing my configuration is
incorrect. Any ideas where it is wrong?
Thanks,
Tom
PS: I added part of the /var/log/syslog from bar at the end.
bar:~# /etc/init.d/drbd start
Starting DRBD resources: [ d0 d1 d2 s0 s1 s2 n0 n1 n2 ].
..........
***************************************************************
DRBD's startup script waits for the peer node(s) to appear.
- In case this node was already a degraded cluster before the
reboot the timeout is 120 seconds. [degr-wfc-timeout]
- If the peer was available before the reboot the timeout will
expire after 30 seconds. [wfc-timeout]
(These values are for resource 'r0'; 0 sec -> wait forever)
To abort waiting enter 'yes' [ 30]:
bar:~# cat /proc/drbd
version: 8.0rc1 (api:86/proto:85)
SVN Revision: 2644 build by tbrown at bar, 2007-01-11 09:04:23
0: cs:WFConnection st:Secondary/Unknown ds:UpToDate/DUnknown C r---
ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0
resync: used:0/31 hits:0 misses:0 starving:0 dirty:0 changed:0
act_log: used:0/257 hits:0 misses:0 starving:0 dirty:0 changed:0
1: cs:WFConnection st:Secondary/Unknown ds:UpToDate/DUnknown C r---
ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0
resync: used:0/31 hits:0 misses:0 starving:0 dirty:0 changed:0
act_log: used:0/257 hits:0 misses:0 starving:0 dirty:0 changed:0
2: cs:WFConnection st:Secondary/Unknown ds:UpToDate/DUnknown C r---
ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0
resync: used:0/31 hits:0 misses:0 starving:0 dirty:0 changed:0
act_log: used:0/257 hits:0 misses:0 starving:0 dirty:0 changed:0
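For comparing the two /proc/drbd listings side by side, here is a minimal Python sketch. It assumes the 8.0-style status line format shown in the output above ("N: cs:... st:... ds:..."); the function name parse_proc_drbd and the two sample strings are hypothetical and only for illustration. It simply extracts the connection state (cs), role (st), and disk state (ds) per minor, which makes it easy to see that foo reports cs:StandAlone while bar reports cs:WFConnection.

```python
import re

# Matches device status lines such as
# "0: cs:WFConnection st:Secondary/Unknown ds:UpToDate/DUnknown C r---"
STATUS_RE = re.compile(
    r"^\s*(?P<minor>\d+):\s+cs:(?P<cs>\S+)\s+st:(?P<st>\S+)\s+ds:(?P<ds>\S+)"
)


def parse_proc_drbd(text):
    """Return {minor: {"cs": ..., "st": ..., "ds": ...}} for each device line."""
    devices = {}
    for line in text.splitlines():
        m = STATUS_RE.match(line)
        if m:
            devices[int(m.group("minor"))] = {
                "cs": m.group("cs"),
                "st": m.group("st"),
                "ds": m.group("ds"),
            }
    return devices


# Status lines copied from the output in this message (counters omitted).
FOO_STATUS = """\
0: cs:StandAlone st:Primary/Unknown ds:UpToDate/DUnknown r---
1: cs:StandAlone st:Primary/Unknown ds:UpToDate/DUnknown r---
2: cs:StandAlone st:Primary/Unknown ds:UpToDate/DUnknown r---
"""

BAR_STATUS = """\
0: cs:WFConnection st:Secondary/Unknown ds:UpToDate/DUnknown C r---
1: cs:WFConnection st:Secondary/Unknown ds:UpToDate/DUnknown C r---
2: cs:WFConnection st:Secondary/Unknown ds:UpToDate/DUnknown C r---
"""

foo = parse_proc_drbd(FOO_STATUS)
bar = parse_proc_drbd(BAR_STATUS)
for minor in sorted(foo):
    print(f"drbd{minor}: foo cs={foo[minor]['cs']}  bar cs={bar[minor]['cs']}")
```

On a live node the same function could be fed the contents of /proc/drbd instead of the sample strings.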
bar:~# cat /etc/drbd.conf
global {
  usage-count yes;
}
common {
  syncer { rate 68M; }
}
resource r0 {
  protocol C;
  handlers {
    pri-on-incon-degr "echo O > /proc/sysrq-trigger ; halt -f";
    pri-lost-after-sb "echo O > /proc/sysrq-trigger ; halt -f";
    local-io-error "echo O > /proc/sysrq-trigger ; halt -f";
    outdate-peer "/usr/sbin/drbd-peer-outdater";
  }
  startup {
    wfc-timeout 30;
    degr-wfc-timeout 120; # 2 minutes.
  }
  disk {
    on-io-error detach;
  }
  net {
    after-sb-0pri disconnect;
    after-sb-1pri disconnect;
    after-sb-2pri disconnect;
    rr-conflict disconnect;
  }
  syncer {
    al-extents 257;
  }
  on foo {
    device /dev/drbd0;
    disk /dev/hdb5;
    address 192.168.1.14:7788;
    meta-disk /dev/hdb1 [0];
  }
  on bar {
    device /dev/drbd0;
    disk /dev/sda8;
    address 192.168.1.15:7788;
    meta-disk /dev/sda5 [0];
  }
}
resource r1 {
  protocol C;
  handlers {
    pri-on-incon-degr "echo O > /proc/sysrq-trigger ; halt -f";
    pri-lost-after-sb "echo O > /proc/sysrq-trigger ; halt -f";
    local-io-error "echo O > /proc/sysrq-trigger ; halt -f";
    outdate-peer "/usr/sbin/drbd-peer-outdater";
  }
  startup {
    wfc-timeout 30;
    degr-wfc-timeout 120; # 2 minutes.
  }
  disk {
    on-io-error detach;
  }
  net {
    after-sb-0pri disconnect;
    after-sb-1pri disconnect;
    after-sb-2pri disconnect;
    rr-conflict disconnect;
  }
  syncer {
    after "r0";
    al-extents 257;
  }
  on foo {
    device /dev/drbd1;
    disk /dev/hdb6;
    address 192.168.1.14:7789;
    meta-disk /dev/hdb2 [0];
  }
  on bar {
    device /dev/drbd1;
    disk /dev/sda9;
    address 192.168.1.15:7789;
    meta-disk /dev/sda6 [0];
  }
}
resource r2 {
  protocol C;
  handlers {
    pri-on-incon-degr "echo O > /proc/sysrq-trigger ; halt -f";
    pri-lost-after-sb "echo O > /proc/sysrq-trigger ; halt -f";
    local-io-error "echo O > /proc/sysrq-trigger ; halt -f";
    outdate-peer "/usr/sbin/drbd-peer-outdater";
  }
  startup {
    wfc-timeout 30;
    degr-wfc-timeout 120; # 2 minutes.
  }
  disk {
    on-io-error detach;
  }
  net {
    after-sb-0pri disconnect;
    after-sb-1pri disconnect;
    after-sb-2pri disconnect;
    rr-conflict disconnect;
  }
  syncer {
    after "r1";
    al-extents 257;
  }
  on foo {
    device /dev/drbd2;
    disk /dev/hdb7;
    address 192.168.1.14:7790;
    meta-disk /dev/hdb3 [0];
  }
  on bar {
    device /dev/drbd2;
    disk /dev/sda10;
    address 192.168.1.15:7790;
    meta-disk /dev/sda7 [0];
  }
}
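To make the address/port layout of the config easier to check at a glance, here is a minimal Python sketch that lists the endpoint each host uses per resource. It is a naive line-based scan, not a real drbd.conf parser, and it assumes the layout used above (one "resource NAME {" line, "on HOST {" blocks, one "address IP:PORT;" line per host). The function name endpoints and the SAMPLE string (an excerpt of r0) are hypothetical.

```python
import re


def endpoints(conf_text):
    """Return a list of (resource, host, ip, port) found in drbd.conf-style text."""
    found = []
    resource = host = None
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        m = re.match(r"resource\s+(\S+)\s*\{", line)
        if m:
            resource, host = m.group(1), None
            continue
        # "on foo {" opens a host block; "on-io-error" does not match
        # because the pattern requires whitespace directly after "on".
        m = re.match(r"on\s+(\S+)\s*\{", line)
        if m:
            host = m.group(1)
            continue
        m = re.match(r"address\s+([\d.]+):(\d+)\s*;", line)
        if m and resource and host:
            found.append((resource, host, m.group(1), int(m.group(2))))
    return found


# Excerpt of resource r0 from the config in this message.
SAMPLE = """\
resource r0 {
  disk {
    on-io-error detach;
  }
  on foo {
    device /dev/drbd0;
    address 192.168.1.14:7788;
  }
  on bar {
    device /dev/drbd0;
    address 192.168.1.15:7788;
  }
}
"""

for resource, host, ip, port in endpoints(SAMPLE):
    print(f"{resource}: {host} -> {ip}:{port}")
```

Run against the full file, this would show whether both hosts of each resource use the same port and whether every resource has its own port, as the config above appears to intend.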
Relevant part of /var/log/syslog:
Jan 15 15:11:34 bar kernel: drbd: initialised. Version: 8.0rc1 (api:86/proto:85)
Jan 15 15:11:34 bar kernel: drbd: SVN Revision: 2644 build by tbrown at bar, 2007-01-11 09:04:23
Jan 15 15:11:34 bar kernel: drbd: registered as block device major 147
Jan 15 15:11:34 bar kernel: drbd: minor_table @ 0xde694c60
Jan 15 15:11:34 bar kernel: drbd0: disk( Diskless -> Attaching )
Jan 15 15:11:34 bar kernel: klogd 1.4.1, ---------- state change ----------
Jan 15 15:11:34 bar kernel: drbd0: No usable activity log found.
Jan 15 15:11:34 bar kernel: drbd0: max_segment_size ( = BIO size ) = 32768
Jan 15 15:11:34 bar kernel: drbd0: drbd_bm_resize called with capacity == 14681457
Jan 15 15:11:34 bar kernel: drbd0: resync bitmap: bits=1835183 words=57350
Jan 15 15:11:34 bar kernel: drbd0: size = 7168 MB (7340728 KB)
Jan 15 15:11:34 bar kernel: drbd0: reading of bitmap took 6 jiffies
Jan 15 15:11:34 bar kernel: drbd0: recounting of set bits took additional 0 jiffies
Jan 15 15:11:34 bar kernel: drbd0: 0 KB marked out-of-sync by on disk bit-map.
Jan 15 15:11:34 bar kernel: drbd0: disk( Attaching -> UpToDate )
Jan 15 15:11:34 bar kernel: drbd0: Writing meta data super block now.
Jan 15 15:11:34 bar kernel: drbd1: disk( Diskless -> Attaching )
Jan 15 15:11:34 bar kernel: drbd1: No usable activity log found.
Jan 15 15:11:34 bar kernel: drbd1: max_segment_size ( = BIO size ) = 32768
Jan 15 15:11:34 bar kernel: drbd1: drbd_bm_resize called with capacity == 14680449
Jan 15 15:11:34 bar kernel: drbd1: resync bitmap: bits=1835057 words=57346
Jan 15 15:11:34 bar kernel: drbd1: size = 7168 MB (7340224 KB)
Jan 15 15:11:34 bar kernel: drbd1: reading of bitmap took 4 jiffies
Jan 15 15:11:34 bar kernel: drbd1: recounting of set bits took additional 1 jiffies
Jan 15 15:11:34 bar kernel: drbd1: 0 KB marked out-of-sync by on disk bit-map.
Jan 15 15:11:34 bar kernel: drbd1: disk( Attaching -> UpToDate )
Jan 15 15:11:34 bar kernel: drbd1: Writing meta data super block now.
Jan 15 15:11:34 bar kernel: drbd2: disk( Diskless -> Attaching )
Jan 15 15:11:34 bar kernel: drbd2: No usable activity log found.
Jan 15 15:11:34 bar kernel: drbd2: max_segment_size ( = BIO size ) = 32768
Jan 15 15:11:34 bar kernel: drbd2: drbd_bm_resize called with capacity == 18874737
Jan 15 15:11:34 bar kernel: drbd2: resync bitmap: bits=2359343 words=73730
Jan 15 15:11:34 bar kernel: drbd2: size = 9216 MB (9437368 KB)
Jan 15 15:11:34 bar kernel: drbd2: reading of bitmap took 2 jiffies
Jan 15 15:11:34 bar kernel: drbd2: recounting of set bits took additional 0 jiffies
Jan 15 15:11:34 bar kernel: drbd2: 0 KB marked out-of-sync by on disk bit-map.
Jan 15 15:11:34 bar kernel: drbd2: disk( Attaching -> UpToDate )
Jan 15 15:11:34 bar kernel: drbd2: Writing meta data super block now.
Jan 15 15:11:34 bar kernel: drbd0: conn( StandAlone -> Unconnected )
Jan 15 15:11:34 bar kernel: drbd0: receiver (re)started
Jan 15 15:11:34 bar kernel: drbd0: conn( Unconnected -> WFConnection )
Jan 15 15:11:34 bar kernel: drbd1: conn( StandAlone -> Unconnected )
Jan 15 15:11:34 bar kernel: drbd1: receiver (re)started
Jan 15 15:11:34 bar kernel: drbd1: conn( Unconnected -> WFConnection )
Jan 15 15:11:34 bar kernel: drbd2: conn( StandAlone -> Unconnected )
Jan 15 15:11:34 bar kernel: drbd2: receiver (re)started
Jan 15 15:11:34 bar kernel: drbd2: conn( Unconnected -> WFConnection )
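The state changes in the log above can be pulled out with a short Python sketch. It assumes the "drbdN: conn( A -> B )" and "drbdN: disk( A -> B )" message format seen in this syslog excerpt; the function names transitions and last_states and the SAMPLE_LOG string are hypothetical. It shows only what the log already says: on bar each device's disk ends at UpToDate and its connection ends at WFConnection.

```python
import re

# Matches e.g. "drbd0: conn( Unconnected -> WFConnection )"
TRANSITION_RE = re.compile(r"(drbd\d+): (conn|disk)\(\s*(\S+)\s*->\s*(\S+)\s*\)")


def transitions(log_text):
    """Return a list of (device, kind, old_state, new_state) in log order."""
    return [m.groups() for m in TRANSITION_RE.finditer(log_text)]


def last_states(log_text):
    """Return {(device, kind): final_state} after replaying all transitions."""
    final = {}
    for device, kind, _old, new in transitions(log_text):
        final[(device, kind)] = new
    return final


# The drbd0 lines copied from the syslog excerpt in this message.
SAMPLE_LOG = """\
Jan 15 15:11:34 bar kernel: drbd0: disk( Diskless -> Attaching )
Jan 15 15:11:34 bar kernel: drbd0: disk( Attaching -> UpToDate )
Jan 15 15:11:34 bar kernel: drbd0: conn( StandAlone -> Unconnected )
Jan 15 15:11:34 bar kernel: drbd0: receiver (re)started
Jan 15 15:11:34 bar kernel: drbd0: conn( Unconnected -> WFConnection )
"""

for (device, kind), state in sorted(last_states(SAMPLE_LOG).items()):
    print(f"{device} {kind}: {state}")
```

Running the same scan over foo's syslog would give a comparable view of how foo's devices ended up in their current connection state.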