> drbd peer outdater is maintained in the heartbeat repository by us. That
> means - use the dopd and drbd-peer-outdater that comes with heartbeat, if
> you have some problem ask us.
> We will remove it from the drbd source to avoid confusion.
>
> From the logs below it seems that dopd on one node did not talk to dopd on
> the second node. Is actually dopd running on both nodes?
>
> Rasto

primary
dktest1debian:~# ps -ef|grep dopd
1001     13692 13618  0 Dec05 ?        00:00:00 /usr/lib/heartbeat/dopd
root     14032 14004  0 07:19 pts/2    00:00:00 grep dopd

secondary
dktest2debian:~# ps -ef|grep dopd
1001     14065 13996  0 Dec05 ?        00:00:00 /usr/lib/heartbeat/dopd
root     14370 14364  0 07:14 pts/3    00:00:00 grep dopd

primary
dktest1debian:~# cat /proc/drbd
version: 8.0.7 (api:86/proto:86)
GIT-hash: cf14288833afe95db396075f8530a5960d29e498 build by root at dktest1debian, 2007-12-04 09:11:56
 2: cs:Connected st:Primary/Secondary ds:UpToDate/UpToDate C r---
    ns:40136 nr:0 dw:79920 dr:45374 al:3 bm:78 lo:0 pe:0 ua:0 ap:0
        resync: used:0/31 hits:2400 misses:46 starving:0 dirty:0 changed:46
        act_log: used:0/127 hits:19977 misses:3 starving:0 dirty:0 changed:3

secondary
dktest2debian:~# cat /proc/drbd
version: 8.0.7 (api:86/proto:86)
GIT-hash: cf14288833afe95db396075f8530a5960d29e498 build by root at dktest2debian, 2007-12-04 09:13:54
 2: cs:Connected st:Secondary/Primary ds:UpToDate/UpToDate C r---
    ns:0 nr:1936 dw:1936 dr:0 al:0 bm:8 lo:0 pe:0 ua:0 ap:0
        resync: used:0/31 hits:106 misses:8 starving:0 dirty:0 changed:8
        act_log: used:0/127 hits:0 misses:0 starving:0 dirty:0 changed:0

Now unplug DRBD link.

primary
dktest1debian:~# cat /proc/drbd
version: 8.0.7 (api:86/proto:86)
GIT-hash: cf14288833afe95db396075f8530a5960d29e498 build by root at dktest1debian, 2007-12-04 09:11:56
 2: cs:WFConnection st:Primary/Unknown ds:UpToDate/DUnknown C r---
    ns:40136 nr:0 dw:79920 dr:45374 al:3 bm:78 lo:0 pe:0 ua:0 ap:0
        resync: used:0/31 hits:2400 misses:46 starving:0 dirty:0 changed:46
        act_log: used:0/127 hits:19977 misses:3 starving:0 dirty:0 changed:3

secondary
dktest2debian:~# cat /proc/drbd
version: 8.0.7 (api:86/proto:86)
GIT-hash: cf14288833afe95db396075f8530a5960d29e498 build by root at dktest2debian, 2007-12-04 09:13:54
 2: cs:WFConnection st:Secondary/Unknown ds:UpToDate/DUnknown C r---
    ns:0 nr:1936 dw:1936 dr:0 al:0 bm:8 lo:0 pe:0 ua:0 ap:0
        resync: used:0/31 hits:106 misses:8 starving:0 dirty:0 changed:8
        act_log: used:0/127 hits:0 misses:0 starving:0 dirty:0 changed:0

primary logs
Dec 6 07:15:08 dktest1debian kernel: drbd2: PingAck did not arrive in time.
Dec 6 07:15:08 dktest1debian kernel: drbd2: peer( Secondary -> Unknown ) conn( Connected -> NetworkFailure ) pdsk( UpToDate -> DUnknown )
Dec 6 07:15:08 dktest1debian kernel: drbd2: Creating new current UUID
Dec 6 07:15:08 dktest1debian kernel: drbd2: asender terminated
Dec 6 07:15:08 dktest1debian kernel: drbd2: short read expecting header on sock: r=-512
Dec 6 07:15:08 dktest1debian kernel: drbd2: tl_clear()
Dec 6 07:15:08 dktest1debian kernel: drbd2: Connection closed
Dec 6 07:15:08 dktest1debian kernel: drbd2: Writing meta data super block now.
Dec 6 07:15:08 dktest1debian kernel: drbd2: helper command: /sbin/drbdadm outdate-peer
Dec 6 07:15:08 dktest1debian drbd-peer-outdater: [14017]: debug: drbd peer: dktest2debian
Dec 6 07:15:08 dktest1debian drbd-peer-outdater: [14017]: debug: drbd resource: drbd2
Dec 6 07:15:08 dktest1debian /usr/lib/heartbeat/dopd: [13692]: debug: Connecting channel
Dec 6 07:15:08 dktest1debian /usr/lib/heartbeat/dopd: [13692]: debug: Client outdater (0x8050868) connected
Dec 6 07:15:08 dktest1debian /usr/lib/heartbeat/dopd: [13692]: debug: invoked: outdater
Dec 6 07:15:08 dktest1debian /usr/lib/heartbeat/dopd: [13692]: debug: Processed 0 messages
Dec 6 07:15:08 dktest1debian /usr/lib/heartbeat/dopd: [13692]: debug: Deleting outdater (0x8050868) from mainloop
Dec 6 07:15:08 dktest1debian kernel: drbd2: State change failed: Refusing to be Primary without at least one UpToDate disk
Dec 6 07:15:08 dktest1debian kernel: drbd2: state = { cs:NetworkFailure st:Primary/Unknown ds:UpToDate/DUnknown r--- }
Dec 6 07:15:08 dktest1debian kernel: drbd2: wanted = { cs:NetworkFailure st:Primary/Unknown ds:Outdated/DUnknown r--- }
Dec 6 07:15:08 dktest1debian kernel: drbd2: outdate-peer helper broken, returned 255

This looks suspicious. And what I just noticed: if I start
"/usr/lib/heartbeat/drbd-peer-outdater -p dktest2debian -r drbd2" by hand,
it ends in a segfault.

Dec 6 07:15:08 dktest1debian kernel: drbd2: Forcing state change from bad state.
Error would be: 'Refusing to be Primary while peer is not outdated'
Dec 6 07:15:08 dktest1debian kernel: drbd2: old = { cs:NetworkFailure st:Primary/Unknown ds:UpToDate/DUnknown r--- }
Dec 6 07:15:08 dktest1debian kernel: drbd2: new = { cs:Unconnected st:Primary/Unknown ds:UpToDate/DUnknown r--- }
Dec 6 07:15:08 dktest1debian kernel: drbd2: conn( NetworkFailure -> Unconnected )
Dec 6 07:15:08 dktest1debian kernel: drbd2: receiver terminated
Dec 6 07:15:08 dktest1debian kernel: drbd2: receiver (re)started
Dec 6 07:15:08 dktest1debian /usr/lib/heartbeat/dopd: [13692]: debug: connection from client closed
Dec 6 07:15:08 dktest1debian kernel: drbd2: Forcing state change from bad state. Error would be: 'Refusing to be Primary while peer is not outdated'
Dec 6 07:15:08 dktest1debian kernel: drbd2: old = { cs:Unconnected st:Primary/Unknown ds:UpToDate/DUnknown r--- }
Dec 6 07:15:08 dktest1debian kernel: drbd2: new = { cs:WFConnection st:Primary/Unknown ds:UpToDate/DUnknown r--- }
Dec 6 07:15:08 dktest1debian kernel: drbd2: conn( Unconnected -> WFConnection )

secondary logs
Dec 6 07:15:07 dktest2debian kernel: drbd2: PingAck did not arrive in time.
Dec 6 07:15:07 dktest2debian kernel: drbd2: peer( Primary -> Unknown ) conn( Connected -> NetworkFailure ) pdsk( UpToDate -> DUnknown )
Dec 6 07:15:07 dktest2debian kernel: drbd2: asender terminated
Dec 6 07:15:07 dktest2debian kernel: drbd2: short read expecting header on sock: r=-512
Dec 6 07:15:07 dktest2debian kernel: drbd2: tl_clear()
Dec 6 07:15:07 dktest2debian kernel: drbd2: Connection closed
Dec 6 07:15:07 dktest2debian kernel: drbd2: Writing meta data super block now.
Dec 6 07:15:07 dktest2debian kernel: drbd2: conn( NetworkFailure -> Unconnected )
Dec 6 07:15:07 dktest2debian kernel: drbd2: receiver terminated
Dec 6 07:15:07 dktest2debian kernel: drbd2: receiver (re)started
Dec 6 07:15:07 dktest2debian kernel: drbd2: conn( Unconnected -> WFConnection )

Regards
Dominik
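
For anyone comparing the two nodes' /proc/drbd output by script rather than by eye, the following is a rough Python sketch that pulls the cs/st/ds fields out of a status line in the 8.0.7 format shown in this message. The function names (parse_status, disconnected_primary) and the dictionary keys are made up for illustration and are not part of DRBD or heartbeat; the regular expression assumes only the line layout visible above and may not fit other DRBD versions.

```python
import re

# Matches a per-device status line as it appears in the /proc/drbd output
# quoted in this message, e.g.
#   " 2: cs:WFConnection st:Primary/Unknown ds:UpToDate/DUnknown C r---"
# Assumption: other DRBD versions may lay this line out differently.
STATUS_RE = re.compile(
    r"^\s*(?P<minor>\d+):\s+cs:(?P<cs>\S+)\s+st:(?P<st>\S+)\s+ds:(?P<ds>\S+)"
)


def parse_status(text):
    """Return one dict per device status line found in /proc/drbd text."""
    devices = []
    for line in text.splitlines():
        match = STATUS_RE.match(line)
        if not match:
            continue  # version, GIT-hash and counter lines are skipped
        role, peer_role = match.group("st").split("/")
        disk, peer_disk = match.group("ds").split("/")
        devices.append({
            "minor": int(match.group("minor")),
            "cs": match.group("cs"),
            "role": role,
            "peer_role": peer_role,
            "disk": disk,
            "peer_disk": peer_disk,
        })
    return devices


def disconnected_primary(dev):
    """True for the state seen on the primary after the link was unplugged:
    not Connected, local role Primary, peer disk still DUnknown."""
    return (
        dev["cs"] != "Connected"
        and dev["role"] == "Primary"
        and dev["peer_disk"] == "DUnknown"
    )


# Status output of the primary after unplugging the link, copied from above.
SAMPLE = """version: 8.0.7 (api:86/proto:86)
 2: cs:WFConnection st:Primary/Unknown ds:UpToDate/DUnknown C r---
    ns:40136 nr:0 dw:79920 dr:45374 al:3 bm:78 lo:0 pe:0 ua:0 ap:0
"""

if __name__ == "__main__":
    for dev in parse_status(SAMPLE):
        print(dev["minor"], dev["cs"], disconnected_primary(dev))
```

On a live node the text would come from reading /proc/drbd instead of the SAMPLE string. The check only reports the state as DRBD prints it; it says nothing about why the outdate-peer helper returned 255.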