[DRBD-user] drbd resource fencing - 2nd try with more information

Dominik Klein dk at in-telegence.net
Thu Dec 6 00:25:45 CET 2007


> drbd-peer-outdater is maintained in the heartbeat repository by us. That 
> means: use the dopd and drbd-peer-outdater that come with heartbeat, and if 
> you have a problem, ask us.
> We will remove it from the drbd source to avoid confusion.
> 
> From the logs below it seems that dopd on one node did not talk to dopd on 
> the second node. Is dopd actually running on both nodes?
> 
> Rasto

primary
dktest1debian:~# ps -ef|grep dopd
1001     13692 13618  0 Dec05 ?        00:00:00 /usr/lib/heartbeat/dopd
root     14032 14004  0 07:19 pts/2    00:00:00 grep dopd

secondary
dktest2debian:~# ps -ef|grep dopd
1001     14065 13996  0 Dec05 ?        00:00:00 /usr/lib/heartbeat/dopd
root     14370 14364  0 07:14 pts/3    00:00:00 grep dopd

primary
dktest1debian:~# cat /proc/drbd
version: 8.0.7 (api:86/proto:86)
GIT-hash: cf14288833afe95db396075f8530a5960d29e498 build by root@dktest1debian, 2007-12-04 09:11:56

  2: cs:Connected st:Primary/Secondary ds:UpToDate/UpToDate C r---
     ns:40136 nr:0 dw:79920 dr:45374 al:3 bm:78 lo:0 pe:0 ua:0 ap:0
         resync: used:0/31 hits:2400 misses:46 starving:0 dirty:0 changed:46
         act_log: used:0/127 hits:19977 misses:3 starving:0 dirty:0 changed:3

secondary
dktest2debian:~# cat /proc/drbd
version: 8.0.7 (api:86/proto:86)
GIT-hash: cf14288833afe95db396075f8530a5960d29e498 build by root@dktest2debian, 2007-12-04 09:13:54

  2: cs:Connected st:Secondary/Primary ds:UpToDate/UpToDate C r---
     ns:0 nr:1936 dw:1936 dr:0 al:0 bm:8 lo:0 pe:0 ua:0 ap:0
         resync: used:0/31 hits:106 misses:8 starving:0 dirty:0 changed:8
         act_log: used:0/127 hits:0 misses:0 starving:0 dirty:0 changed:0

Now I unplug the DRBD link.

primary
dktest1debian:~# cat /proc/drbd
version: 8.0.7 (api:86/proto:86)
GIT-hash: cf14288833afe95db396075f8530a5960d29e498 build by root@dktest1debian, 2007-12-04 09:11:56

  2: cs:WFConnection st:Primary/Unknown ds:UpToDate/DUnknown C r---
     ns:40136 nr:0 dw:79920 dr:45374 al:3 bm:78 lo:0 pe:0 ua:0 ap:0
         resync: used:0/31 hits:2400 misses:46 starving:0 dirty:0 changed:46
         act_log: used:0/127 hits:19977 misses:3 starving:0 dirty:0 changed:3

secondary
dktest2debian:~# cat /proc/drbd
version: 8.0.7 (api:86/proto:86)
GIT-hash: cf14288833afe95db396075f8530a5960d29e498 build by root@dktest2debian, 2007-12-04 09:13:54

  2: cs:WFConnection st:Secondary/Unknown ds:UpToDate/DUnknown C r---
     ns:0 nr:1936 dw:1936 dr:0 al:0 bm:8 lo:0 pe:0 ua:0 ap:0
         resync: used:0/31 hits:106 misses:8 starving:0 dirty:0 changed:8
         act_log: used:0/127 hits:0 misses:0 starving:0 dirty:0 changed:0
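
For anyone comparing the states above, here is a minimal shell sketch that pulls the cs/st/ds fields out of /proc/drbd-style text and checks whether the peer disk is marked Outdated. The function names drbd_state and peer_outdated are made up for illustration and are not part of DRBD or heartbeat. The sketch assumes the 8.0.x status line layout shown above, where the peer's disk state is the part after the slash in the ds: field.

```shell
#!/bin/sh
# Hypothetical helpers, not part of DRBD or heartbeat.
# Both read /proc/drbd-style text on stdin.

# Print "<minor>: cs:... st:... ds:..." for every device line.
drbd_state() {
    awk '/^ *[0-9]+: cs:/ { print $1, $2, $3, $4 }'
}

# Succeed (exit 0) only if some device reports its peer disk as Outdated,
# i.e. the ds: field ends in "/Outdated".
peer_outdated() {
    drbd_state | grep -q 'ds:[A-Za-z]*/Outdated$'
}

# Example usage on a live node:
#   drbd_state < /proc/drbd
#   peer_outdated < /proc/drbd && echo "peer is outdated"
```

Run against the primary's output above, peer_outdated fails, because the peer disk stays at DUnknown after the link is unplugged.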


primary logs
Dec  6 07:15:08 dktest1debian kernel: drbd2: PingAck did not arrive in time.
Dec  6 07:15:08 dktest1debian kernel: drbd2: peer( Secondary -> Unknown ) conn( Connected -> NetworkFailure ) pdsk( UpToDate -> DUnknown )
Dec  6 07:15:08 dktest1debian kernel: drbd2: Creating new current UUID
Dec  6 07:15:08 dktest1debian kernel: drbd2: asender terminated
Dec  6 07:15:08 dktest1debian kernel: drbd2: short read expecting header on sock: r=-512
Dec  6 07:15:08 dktest1debian kernel: drbd2: tl_clear()
Dec  6 07:15:08 dktest1debian kernel: drbd2: Connection closed
Dec  6 07:15:08 dktest1debian kernel: drbd2: Writing meta data super block now.
Dec  6 07:15:08 dktest1debian kernel: drbd2: helper command: /sbin/drbdadm outdate-peer
Dec  6 07:15:08 dktest1debian drbd-peer-outdater: [14017]: debug: drbd peer: dktest2debian
Dec  6 07:15:08 dktest1debian drbd-peer-outdater: [14017]: debug: drbd resource: drbd2
Dec  6 07:15:08 dktest1debian /usr/lib/heartbeat/dopd: [13692]: debug: Connecting channel
Dec  6 07:15:08 dktest1debian /usr/lib/heartbeat/dopd: [13692]: debug: Client outdater (0x8050868) connected
Dec  6 07:15:08 dktest1debian /usr/lib/heartbeat/dopd: [13692]: debug: invoked: outdater
Dec  6 07:15:08 dktest1debian /usr/lib/heartbeat/dopd: [13692]: debug: Processed 0 messages
Dec  6 07:15:08 dktest1debian /usr/lib/heartbeat/dopd: [13692]: debug: Deleting outdater (0x8050868) from mainloop
Dec  6 07:15:08 dktest1debian kernel: drbd2: State change failed: Refusing to be Primary without at least one UpToDate disk
Dec  6 07:15:08 dktest1debian kernel: drbd2:   state = { cs:NetworkFailure st:Primary/Unknown ds:UpToDate/DUnknown r--- }
Dec  6 07:15:08 dktest1debian kernel: drbd2:  wanted = { cs:NetworkFailure st:Primary/Unknown ds:Outdated/DUnknown r--- }
Dec  6 07:15:08 dktest1debian kernel: drbd2: outdate-peer helper broken, returned 255

The "outdate-peer helper broken, returned 255" line looks suspicious. I 
also just noticed that if I start 
"/usr/lib/heartbeat/drbd-peer-outdater -p dktest2debian -r drbd2" by 
hand, it ends in a segfault.
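
To tell a crash apart from an ordinary error exit when running the helper by hand, one can look at the shell's exit status right after the command. This is only a minimal sketch, assuming a POSIX-style shell where a status of 128+N usually means the process died from signal N (139 would be SIGSEGV, signal 11, on Linux). describe_status is a hypothetical name, not something shipped with heartbeat or DRBD, and the upper bound of 64 signals is an assumption about Linux.

```shell
#!/bin/sh
# Hypothetical helper: describe an exit status as the shell reports it in $?.
# Assumption: statuses 129..192 are treated as "128 + signal number".
# A program may also exit with such a code on its own, so this is a hint only.
describe_status() {
    status=$1
    if [ "$status" -gt 128 ] && [ "$status" -le 192 ]; then
        echo "probably killed by signal $((status - 128))"
    else
        echo "exited with code $status"
    fi
}

# Example usage, with the command line from this post:
#   /usr/lib/heartbeat/drbd-peer-outdater -p dktest2debian -r drbd2
#   describe_status $?
```
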

Dec  6 07:15:08 dktest1debian kernel: drbd2: Forcing state change from bad state. Error would be: 'Refusing to be Primary while peer is not outdated'
Dec  6 07:15:08 dktest1debian kernel: drbd2:  old = { cs:NetworkFailure st:Primary/Unknown ds:UpToDate/DUnknown r--- }
Dec  6 07:15:08 dktest1debian kernel: drbd2:  new = { cs:Unconnected st:Primary/Unknown ds:UpToDate/DUnknown r--- }
Dec  6 07:15:08 dktest1debian kernel: drbd2: conn( NetworkFailure -> Unconnected )
Dec  6 07:15:08 dktest1debian kernel: drbd2: receiver terminated
Dec  6 07:15:08 dktest1debian kernel: drbd2: receiver (re)started
Dec  6 07:15:08 dktest1debian /usr/lib/heartbeat/dopd: [13692]: debug: connection from client closed
Dec  6 07:15:08 dktest1debian kernel: drbd2: Forcing state change from bad state. Error would be: 'Refusing to be Primary while peer is not outdated'
Dec  6 07:15:08 dktest1debian kernel: drbd2:  old = { cs:Unconnected st:Primary/Unknown ds:UpToDate/DUnknown r--- }
Dec  6 07:15:08 dktest1debian kernel: drbd2:  new = { cs:WFConnection st:Primary/Unknown ds:UpToDate/DUnknown r--- }
Dec  6 07:15:08 dktest1debian kernel: drbd2: conn( Unconnected -> WFConnection )

secondary logs
Dec  6 07:15:07 dktest2debian kernel: drbd2: PingAck did not arrive in time.
Dec  6 07:15:07 dktest2debian kernel: drbd2: peer( Primary -> Unknown ) conn( Connected -> NetworkFailure ) pdsk( UpToDate -> DUnknown )
Dec  6 07:15:07 dktest2debian kernel: drbd2: asender terminated
Dec  6 07:15:07 dktest2debian kernel: drbd2: short read expecting header on sock: r=-512
Dec  6 07:15:07 dktest2debian kernel: drbd2: tl_clear()
Dec  6 07:15:07 dktest2debian kernel: drbd2: Connection closed
Dec  6 07:15:07 dktest2debian kernel: drbd2: Writing meta data super block now.
Dec  6 07:15:07 dktest2debian kernel: drbd2: conn( NetworkFailure -> Unconnected )
Dec  6 07:15:07 dktest2debian kernel: drbd2: receiver terminated
Dec  6 07:15:07 dktest2debian kernel: drbd2: receiver (re)started
Dec  6 07:15:07 dktest2debian kernel: drbd2: conn( Unconnected -> WFConnection )

Regards
Dominik


