Hi,

I have drbd and heartbeat installed on two nodes: foo and bar. When I
bring up the primary node (foo), drbd starts up, then heartbeat starts
up, foo goes into primary mode, and heartbeat mounts the filesystems.

foo:~# cat /proc/drbd
version: 8.0rc1 (api:86/proto:85)
SVN Revision: 2644 build by tbrown at foo, 2007-01-11 09:03:54
 0: cs:StandAlone st:Primary/Unknown ds:UpToDate/DUnknown r---
    ns:0 nr:0 dw:104 dr:273 al:0 bm:0 lo:0 pe:0 ua:0 ap:0
        resync: used:0/31 hits:0 misses:0 starving:0 dirty:0 changed:0
        act_log: used:0/257 hits:26 misses:0 starving:0 dirty:0 changed:0
 1: cs:StandAlone st:Primary/Unknown ds:UpToDate/DUnknown r---
    ns:0 nr:0 dw:40 dr:245 al:0 bm:0 lo:0 pe:0 ua:0 ap:0
        resync: used:0/31 hits:0 misses:0 starving:0 dirty:0 changed:0
        act_log: used:0/257 hits:10 misses:0 starving:0 dirty:0 changed:0
 2: cs:StandAlone st:Primary/Unknown ds:UpToDate/DUnknown r---
    ns:0 nr:0 dw:40 dr:309 al:0 bm:0 lo:0 pe:0 ua:0 ap:0
        resync: used:0/31 hits:0 misses:0 starving:0 dirty:0 changed:0
        act_log: used:0/257 hits:10 misses:0 starving:0 dirty:0 changed:0

However, when I bring up the secondary node (bar), drbd will not
connect to foo. The output of starting drbd on bar is below, followed
by the output of 'cat /proc/drbd' and the drbd.conf.

I'm guessing my configuration is incorrect. Any ideas where it is
wrong?

Thanks,
Tom

PS: I added part of the /var/log/syslog from bar at the end.

bar:~# /etc/init.d/drbd start
Starting DRBD resources: [ d0 d1 d2 s0 s1 s2 n0 n1 n2 ].
..........
***************************************************************
 DRBD's startup script waits for the peer node(s) to appear.
 - In case this node was already a degraded cluster before the
   reboot the timeout is 120 seconds. [degr-wfc-timeout]
 - If the peer was available before the reboot the timeout will
   expire after 30 seconds. [wfc-timeout]
   (These values are for resource 'r0'; 0 sec -> wait forever)
 To abort waiting enter 'yes' [ 30]:

bar:~# cat /proc/drbd
version: 8.0rc1 (api:86/proto:85)
SVN Revision: 2644 build by tbrown at bar, 2007-01-11 09:04:23
 0: cs:WFConnection st:Secondary/Unknown ds:UpToDate/DUnknown C r---
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0
        resync: used:0/31 hits:0 misses:0 starving:0 dirty:0 changed:0
        act_log: used:0/257 hits:0 misses:0 starving:0 dirty:0 changed:0
 1: cs:WFConnection st:Secondary/Unknown ds:UpToDate/DUnknown C r---
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0
        resync: used:0/31 hits:0 misses:0 starving:0 dirty:0 changed:0
        act_log: used:0/257 hits:0 misses:0 starving:0 dirty:0 changed:0
 2: cs:WFConnection st:Secondary/Unknown ds:UpToDate/DUnknown C r---
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0
        resync: used:0/31 hits:0 misses:0 starving:0 dirty:0 changed:0
        act_log: used:0/257 hits:0 misses:0 starving:0 dirty:0 changed:0

bar:~# cat /etc/drbd.conf
global {
    usage-count yes;
}

common {
    syncer { rate 68M; }
}

resource r0 {
    protocol C;
    handlers {
        pri-on-incon-degr "echo O > /proc/sysrq-trigger ; halt -f";
        pri-lost-after-sb "echo O > /proc/sysrq-trigger ; halt -f";
        local-io-error "echo O > /proc/sysrq-trigger ; halt -f";
        outdate-peer "/usr/sbin/drbd-peer-outdater";
    }
    startup {
        wfc-timeout 30;
        degr-wfc-timeout 120;    # 2 minutes.
    }
    disk {
        on-io-error detach;
    }
    net {
        after-sb-0pri disconnect;
        after-sb-1pri disconnect;
        after-sb-2pri disconnect;
        rr-conflict disconnect;
    }
    syncer {
        al-extents 257;
    }
    on foo {
        device /dev/drbd0;
        disk /dev/hdb5;
        address 192.168.1.14:7788;
        meta-disk /dev/hdb1 [0];
    }
    on bar {
        device /dev/drbd0;
        disk /dev/sda8;
        address 192.168.1.15:7788;
        meta-disk /dev/sda5 [0];
    }
}

resource r1 {
    protocol C;
    handlers {
        pri-on-incon-degr "echo O > /proc/sysrq-trigger ; halt -f";
        pri-lost-after-sb "echo O > /proc/sysrq-trigger ; halt -f";
        local-io-error "echo O > /proc/sysrq-trigger ; halt -f";
        outdate-peer "/usr/sbin/drbd-peer-outdater";
    }
    startup {
        wfc-timeout 30;
        degr-wfc-timeout 120;    # 2 minutes.
    }
    disk {
        on-io-error detach;
    }
    net {
        after-sb-0pri disconnect;
        after-sb-1pri disconnect;
        after-sb-2pri disconnect;
        rr-conflict disconnect;
    }
    syncer {
        after "r0";
        al-extents 257;
    }
    on foo {
        device /dev/drbd1;
        disk /dev/hdb6;
        address 192.168.1.14:7789;
        meta-disk /dev/hdb2 [0];
    }
    on bar {
        device /dev/drbd1;
        disk /dev/sda9;
        address 192.168.1.15:7789;
        meta-disk /dev/sda6 [0];
    }
}

resource r2 {
    protocol C;
    handlers {
        pri-on-incon-degr "echo O > /proc/sysrq-trigger ; halt -f";
        pri-lost-after-sb "echo O > /proc/sysrq-trigger ; halt -f";
        local-io-error "echo O > /proc/sysrq-trigger ; halt -f";
        outdate-peer "/usr/sbin/drbd-peer-outdater";
    }
    startup {
        wfc-timeout 30;
        degr-wfc-timeout 120;    # 2 minutes.
    }
    disk {
        on-io-error detach;
    }
    net {
        after-sb-0pri disconnect;
        after-sb-1pri disconnect;
        after-sb-2pri disconnect;
        rr-conflict disconnect;
    }
    syncer {
        after "r1";
        al-extents 257;
    }
    on foo {
        device /dev/drbd2;
        disk /dev/hdb7;
        address 192.168.1.14:7790;
        meta-disk /dev/hdb3 [0];
    }
    on bar {
        device /dev/drbd2;
        disk /dev/sda10;
        address 192.168.1.15:7790;
        meta-disk /dev/sda7 [0];
    }
}

Relevant part of /var/log/syslog:

Jan 15 15:11:34 bar kernel: drbd: initialised. Version: 8.0rc1 (api:86/proto:85)
Jan 15 15:11:34 bar kernel: drbd: SVN Revision: 2644 build by tbrown at bar, 2007-01-11 09:04:23
Jan 15 15:11:34 bar kernel: drbd: registered as block device major 147
Jan 15 15:11:34 bar kernel: drbd: minor_table @ 0xde694c60
Jan 15 15:11:34 bar kernel: drbd0: disk( Diskless -> Attaching )
Jan 15 15:11:34 bar kernel: klogd 1.4.1, ---------- state change ----------
Jan 15 15:11:34 bar kernel: drbd0: No usable activity log found.
Jan 15 15:11:34 bar kernel: drbd0: max_segment_size ( = BIO size ) = 32768
Jan 15 15:11:34 bar kernel: drbd0: drbd_bm_resize called with capacity == 14681457
Jan 15 15:11:34 bar kernel: drbd0: resync bitmap: bits=1835183 words=57350
Jan 15 15:11:34 bar kernel: drbd0: size = 7168 MB (7340728 KB)
Jan 15 15:11:34 bar kernel: drbd0: reading of bitmap took 6 jiffies
Jan 15 15:11:34 bar kernel: drbd0: recounting of set bits took additional 0 jiffies
Jan 15 15:11:34 bar kernel: drbd0: 0 KB marked out-of-sync by on disk bit-map.
Jan 15 15:11:34 bar kernel: drbd0: disk( Attaching -> UpToDate )
Jan 15 15:11:34 bar kernel: drbd0: Writing meta data super block now.
Jan 15 15:11:34 bar kernel: drbd1: disk( Diskless -> Attaching )
Jan 15 15:11:34 bar kernel: drbd1: No usable activity log found.
Jan 15 15:11:34 bar kernel: drbd1: max_segment_size ( = BIO size ) = 32768
Jan 15 15:11:34 bar kernel: drbd1: drbd_bm_resize called with capacity == 14680449
Jan 15 15:11:34 bar kernel: drbd1: resync bitmap: bits=1835057 words=57346
Jan 15 15:11:34 bar kernel: drbd1: size = 7168 MB (7340224 KB)
Jan 15 15:11:34 bar kernel: drbd1: reading of bitmap took 4 jiffies
Jan 15 15:11:34 bar kernel: drbd1: recounting of set bits took additional 1 jiffies
Jan 15 15:11:34 bar kernel: drbd1: 0 KB marked out-of-sync by on disk bit-map.
Jan 15 15:11:34 bar kernel: drbd1: disk( Attaching -> UpToDate )
Jan 15 15:11:34 bar kernel: drbd1: Writing meta data super block now.
Jan 15 15:11:34 bar kernel: drbd2: disk( Diskless -> Attaching )
Jan 15 15:11:34 bar kernel: drbd2: No usable activity log found.
Jan 15 15:11:34 bar kernel: drbd2: max_segment_size ( = BIO size ) = 32768
Jan 15 15:11:34 bar kernel: drbd2: drbd_bm_resize called with capacity == 18874737
Jan 15 15:11:34 bar kernel: drbd2: resync bitmap: bits=2359343 words=73730
Jan 15 15:11:34 bar kernel: drbd2: size = 9216 MB (9437368 KB)
Jan 15 15:11:34 bar kernel: drbd2: reading of bitmap took 2 jiffies
Jan 15 15:11:34 bar kernel: drbd2: recounting of set bits took additional 0 jiffies
Jan 15 15:11:34 bar kernel: drbd2: 0 KB marked out-of-sync by on disk bit-map.
Jan 15 15:11:34 bar kernel: drbd2: disk( Attaching -> UpToDate )
Jan 15 15:11:34 bar kernel: drbd2: Writing meta data super block now.
Jan 15 15:11:34 bar kernel: drbd0: conn( StandAlone -> Unconnected )
Jan 15 15:11:34 bar kernel: drbd0: receiver (re)started
Jan 15 15:11:34 bar kernel: drbd0: conn( Unconnected -> WFConnection )
Jan 15 15:11:34 bar kernel: drbd1: conn( StandAlone -> Unconnected )
Jan 15 15:11:34 bar kernel: drbd1: receiver (re)started
Jan 15 15:11:34 bar kernel: drbd1: conn( Unconnected -> WFConnection )
Jan 15 15:11:34 bar kernel: drbd2: conn( StandAlone -> Unconnected )
Jan 15 15:11:34 bar kernel: drbd2: receiver (re)started
Jan 15 15:11:34 bar kernel: drbd2: conn( Unconnected -> WFConnection )
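
For comparing the two nodes, the per-device connection state (cs), role (st) and disk state (ds) can be pulled out of the 'cat /proc/drbd' output quoted above. The following is only a rough sketch in Python, assuming the 8.0-style line layout shown in this message; the names parse_proc_drbd, devices_in_state, SAMPLE_FOO and SAMPLE_BAR are made up for illustration and are not part of DRBD. The sample strings are shortened copies of the output from foo and bar.

```python
import re

# Matches device lines such as
# " 0: cs:WFConnection st:Secondary/Unknown ds:UpToDate/DUnknown C r---"
# (layout assumed from the 8.0rc1 output quoted in this message).
DEVICE_RE = re.compile(
    r"^\s*(?P<minor>\d+):\s+cs:(?P<cs>\S+)\s+st:(?P<st>\S+)\s+ds:(?P<ds>\S+)"
)


def parse_proc_drbd(text):
    """Return {minor: {"cs": ..., "st": ..., "ds": ...}} for each device line."""
    devices = {}
    for line in text.splitlines():
        match = DEVICE_RE.match(line)
        if match:
            devices[int(match.group("minor"))] = {
                "cs": match.group("cs"),
                "st": match.group("st"),
                "ds": match.group("ds"),
            }
    return devices


def devices_in_state(devices, state):
    """Return the sorted minor numbers whose connection state equals `state`."""
    return sorted(minor for minor, info in devices.items() if info["cs"] == state)


# Shortened copies of the output quoted above (counter lines omitted).
SAMPLE_FOO = """\
version: 8.0rc1 (api:86/proto:85)
 0: cs:StandAlone st:Primary/Unknown ds:UpToDate/DUnknown r---
    ns:0 nr:0 dw:104 dr:273 al:0 bm:0 lo:0 pe:0 ua:0 ap:0
 1: cs:StandAlone st:Primary/Unknown ds:UpToDate/DUnknown r---
 2: cs:StandAlone st:Primary/Unknown ds:UpToDate/DUnknown r---
"""

SAMPLE_BAR = """\
version: 8.0rc1 (api:86/proto:85)
 0: cs:WFConnection st:Secondary/Unknown ds:UpToDate/DUnknown C r---
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0
 1: cs:WFConnection st:Secondary/Unknown ds:UpToDate/DUnknown C r---
 2: cs:WFConnection st:Secondary/Unknown ds:UpToDate/DUnknown C r---
"""

if __name__ == "__main__":
    for name, sample in (("foo", SAMPLE_FOO), ("bar", SAMPLE_BAR)):
        for minor, info in sorted(parse_proc_drbd(sample).items()):
            print(name, minor, info["cs"], info["st"], info["ds"])
```

Run against the two samples, this lists all three devices on foo as StandAlone and all three on bar as WFConnection, which is the difference between the nodes visible in the output above.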