> drbd-peer-outdater is maintained by us in the heartbeat repository. That
> means: use the dopd and drbd-peer-outdater that come with heartbeat, and if
> you have a problem, ask us.
> We will remove it from the drbd source to avoid confusion.
>
> From the logs below it seems that dopd on one node did not talk to dopd on
> the second node. Is dopd actually running on both nodes?
>
> Rasto
primary
dktest1debian:~# ps -ef|grep dopd
1001 13692 13618 0 Dec05 ? 00:00:00 /usr/lib/heartbeat/dopd
root 14032 14004 0 07:19 pts/2 00:00:00 grep dopd
secondary
dktest2debian:~# ps -ef|grep dopd
1001 14065 13996 0 Dec05 ? 00:00:00 /usr/lib/heartbeat/dopd
root 14370 14364 0 07:14 pts/3 00:00:00 grep dopd
primary
dktest1debian:~# cat /proc/drbd
version: 8.0.7 (api:86/proto:86)
GIT-hash: cf14288833afe95db396075f8530a5960d29e498 build by root@dktest1debian, 2007-12-04 09:11:56
2: cs:Connected st:Primary/Secondary ds:UpToDate/UpToDate C r---
ns:40136 nr:0 dw:79920 dr:45374 al:3 bm:78 lo:0 pe:0 ua:0 ap:0
resync: used:0/31 hits:2400 misses:46 starving:0 dirty:0 changed:46
act_log: used:0/127 hits:19977 misses:3 starving:0 dirty:0 changed:3
secondary
dktest2debian:~# cat /proc/drbd
version: 8.0.7 (api:86/proto:86)
GIT-hash: cf14288833afe95db396075f8530a5960d29e498 build by root@dktest2debian, 2007-12-04 09:13:54
2: cs:Connected st:Secondary/Primary ds:UpToDate/UpToDate C r---
ns:0 nr:1936 dw:1936 dr:0 al:0 bm:8 lo:0 pe:0 ua:0 ap:0
resync: used:0/31 hits:106 misses:8 starving:0 dirty:0 changed:8
act_log: used:0/127 hits:0 misses:0 starving:0 dirty:0 changed:0
Now I unplug the DRBD link.
primary
dktest1debian:~# cat /proc/drbd
version: 8.0.7 (api:86/proto:86)
GIT-hash: cf14288833afe95db396075f8530a5960d29e498 build by root@dktest1debian, 2007-12-04 09:11:56
2: cs:WFConnection st:Primary/Unknown ds:UpToDate/DUnknown C r---
ns:40136 nr:0 dw:79920 dr:45374 al:3 bm:78 lo:0 pe:0 ua:0 ap:0
resync: used:0/31 hits:2400 misses:46 starving:0 dirty:0 changed:46
act_log: used:0/127 hits:19977 misses:3 starving:0 dirty:0 changed:3
secondary
dktest2debian:~# cat /proc/drbd
version: 8.0.7 (api:86/proto:86)
GIT-hash: cf14288833afe95db396075f8530a5960d29e498 build by root@dktest2debian, 2007-12-04 09:13:54
2: cs:WFConnection st:Secondary/Unknown ds:UpToDate/DUnknown C r---
ns:0 nr:1936 dw:1936 dr:0 al:0 bm:8 lo:0 pe:0 ua:0 ap:0
resync: used:0/31 hits:106 misses:8 starving:0 dirty:0 changed:8
act_log: used:0/127 hits:0 misses:0 starving:0 dirty:0 changed:0
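The two /proc/drbd snapshots can also be compared mechanically. Below is a minimal Python sketch, not part of DRBD or heartbeat, that parses the per-device status line in the 8.0.x format shown above. The names parse_status and secondary_not_outdated are hypothetical helpers made up for this illustration. The check reflects what I would expect to see here: a Secondary that has lost its connection but still reports its own disk as UpToDate has not been outdated.

```python
import re

# Per-device status line as printed above by DRBD 8.0.7, e.g.
# "2: cs:WFConnection st:Secondary/Unknown ds:UpToDate/DUnknown C r---"
STATUS_RE = re.compile(
    r"^\s*(?P<minor>\d+):\s+cs:(?P<cs>\S+)\s+st:(?P<st>\S+)\s+ds:(?P<ds>\S+)"
)


def parse_status(line):
    """Split one status line into its fields; return None for other lines."""
    m = STATUS_RE.match(line)
    if m is None:
        return None
    local_role, peer_role = m.group("st").split("/")
    local_disk, peer_disk = m.group("ds").split("/")
    return {
        "minor": int(m.group("minor")),
        "cs": m.group("cs"),
        "local_role": local_role,
        "peer_role": peer_role,
        "local_disk": local_disk,
        "peer_disk": peer_disk,
    }


def secondary_not_outdated(status):
    """Hypothetical check: disconnected Secondary whose own disk is still UpToDate."""
    return (
        status["cs"] != "Connected"
        and status["local_role"] == "Secondary"
        and status["local_disk"] == "UpToDate"
    )


if __name__ == "__main__":
    # Assumes it is run on a node where /proc/drbd exists.
    with open("/proc/drbd") as f:
        for line in f:
            status = parse_status(line)
            if status is not None:
                print(status, "not outdated:", secondary_not_outdated(status))
```

Fed the secondary's line after the unplug, the check should come out true, which matches the output above: the secondary sits in WFConnection with ds:UpToDate/DUnknown.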
primary logs
Dec 6 07:15:08 dktest1debian kernel: drbd2: PingAck did not arrive in time.
Dec 6 07:15:08 dktest1debian kernel: drbd2: peer( Secondary -> Unknown ) conn( Connected -> NetworkFailure ) pdsk( UpToDate -> DUnknown )
Dec 6 07:15:08 dktest1debian kernel: drbd2: Creating new current UUID
Dec 6 07:15:08 dktest1debian kernel: drbd2: asender terminated
Dec 6 07:15:08 dktest1debian kernel: drbd2: short read expecting header on sock: r=-512
Dec 6 07:15:08 dktest1debian kernel: drbd2: tl_clear()
Dec 6 07:15:08 dktest1debian kernel: drbd2: Connection closed
Dec 6 07:15:08 dktest1debian kernel: drbd2: Writing meta data super block now.
Dec 6 07:15:08 dktest1debian kernel: drbd2: helper command: /sbin/drbdadm outdate-peer
Dec 6 07:15:08 dktest1debian drbd-peer-outdater: [14017]: debug: drbd peer: dktest2debian
Dec 6 07:15:08 dktest1debian drbd-peer-outdater: [14017]: debug: drbd resource: drbd2
Dec 6 07:15:08 dktest1debian /usr/lib/heartbeat/dopd: [13692]: debug: Connecting channel
Dec 6 07:15:08 dktest1debian /usr/lib/heartbeat/dopd: [13692]: debug: Client outdater (0x8050868) connected
Dec 6 07:15:08 dktest1debian /usr/lib/heartbeat/dopd: [13692]: debug: invoked: outdater
Dec 6 07:15:08 dktest1debian /usr/lib/heartbeat/dopd: [13692]: debug: Processed 0 messages
Dec 6 07:15:08 dktest1debian /usr/lib/heartbeat/dopd: [13692]: debug: Deleting outdater (0x8050868) from mainloop
Dec 6 07:15:08 dktest1debian kernel: drbd2: State change failed: Refusing to be Primary without at least one UpToDate disk
Dec 6 07:15:08 dktest1debian kernel: drbd2: state = { cs:NetworkFailure st:Primary/Unknown ds:UpToDate/DUnknown r--- }
Dec 6 07:15:08 dktest1debian kernel: drbd2: wanted = { cs:NetworkFailure st:Primary/Unknown ds:Outdated/DUnknown r--- }
Dec 6 07:15:08 dktest1debian kernel: drbd2: outdate-peer helper broken, returned 255
This looks suspicious. I also just noticed that if I start
"/usr/lib/heartbeat/drbd-peer-outdater -p dktest2debian -r drbd2" by
hand, it ends in a segfault.
Dec 6 07:15:08 dktest1debian kernel: drbd2: Forcing state change from bad state. Error would be: 'Refusing to be Primary while peer is not outdated'
Dec 6 07:15:08 dktest1debian kernel: drbd2: old = { cs:NetworkFailure st:Primary/Unknown ds:UpToDate/DUnknown r--- }
Dec 6 07:15:08 dktest1debian kernel: drbd2: new = { cs:Unconnected st:Primary/Unknown ds:UpToDate/DUnknown r--- }
Dec 6 07:15:08 dktest1debian kernel: drbd2: conn( NetworkFailure -> Unconnected )
Dec 6 07:15:08 dktest1debian kernel: drbd2: receiver terminated
Dec 6 07:15:08 dktest1debian kernel: drbd2: receiver (re)started
Dec 6 07:15:08 dktest1debian /usr/lib/heartbeat/dopd: [13692]: debug: connection from client closed
Dec 6 07:15:08 dktest1debian kernel: drbd2: Forcing state change from bad state. Error would be: 'Refusing to be Primary while peer is not outdated'
Dec 6 07:15:08 dktest1debian kernel: drbd2: old = { cs:Unconnected st:Primary/Unknown ds:UpToDate/DUnknown r--- }
Dec 6 07:15:08 dktest1debian kernel: drbd2: new = { cs:WFConnection st:Primary/Unknown ds:UpToDate/DUnknown r--- }
Dec 6 07:15:08 dktest1debian kernel: drbd2: conn( Unconnected -> WFConnection )
secondary logs
Dec 6 07:15:07 dktest2debian kernel: drbd2: PingAck did not arrive in time.
Dec 6 07:15:07 dktest2debian kernel: drbd2: peer( Primary -> Unknown ) conn( Connected -> NetworkFailure ) pdsk( UpToDate -> DUnknown )
Dec 6 07:15:07 dktest2debian kernel: drbd2: asender terminated
Dec 6 07:15:07 dktest2debian kernel: drbd2: short read expecting header on sock: r=-512
Dec 6 07:15:07 dktest2debian kernel: drbd2: tl_clear()
Dec 6 07:15:07 dktest2debian kernel: drbd2: Connection closed
Dec 6 07:15:07 dktest2debian kernel: drbd2: Writing meta data super block now.
Dec 6 07:15:07 dktest2debian kernel: drbd2: conn( NetworkFailure -> Unconnected )
Dec 6 07:15:07 dktest2debian kernel: drbd2: receiver terminated
Dec 6 07:15:07 dktest2debian kernel: drbd2: receiver (re)started
Dec 6 07:15:07 dktest2debian kernel: drbd2: conn( Unconnected -> WFConnection )
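To make the state changes in these logs easier to follow, here is a small Python sketch that pulls every "field( Old -> New )" transition out of a chunk of kernel log text. The names transitions and was_outdated are hypothetical, made up for this illustration. The second helper rests on an assumption I have not verified against the DRBD source: that a successful outdate would show up in the log as a disk( ... -> Outdated ) transition. The pattern also tolerates a transition that has been wrapped across lines.

```python
import re

# Matches transitions such as "conn( Connected -> NetworkFailure )".
# \s* also matches newlines, so a wrapped transition is still found.
TRANSITION_RE = re.compile(r"(\w+)\(\s*(\w+)\s*->\s*(\w+)\s*\)")


def transitions(log_text):
    """Return a list of (field, old_state, new_state) tuples in log order."""
    return TRANSITION_RE.findall(log_text)


def was_outdated(log_text):
    """Assumption: an outdate appears as a disk( ... -> Outdated ) transition."""
    return any(
        field == "disk" and new == "Outdated"
        for field, _old, new in transitions(log_text)
    )


if __name__ == "__main__":
    import sys

    # Usage (hypothetical): grep drbd2 /var/log/syslog | python transitions.py
    text = sys.stdin.read()
    for field, old, new in transitions(text):
        print(f"{field}: {old} -> {new}")
    print("outdated:", was_outdated(text))
```

Run over the secondary's log above, this lists only peer, conn and pdsk transitions, with no change to the local disk state at all, which fits the picture that the outdate request never reached that node.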
Regards
Dominik