Hello all,
I have a two-node cluster on CentOS 5.2, kernel
2.6.18-92.1.22.el5.centos.plus, with drbd-8.3.0-3 and
drbd-km-2.6.18_92.1.22.el5.centos.plus-8.3.0-3, which I compiled and
installed as RPMs myself.
Although I have two Gigabit Ethernet NICs connected back-to-back for DRBD and
clustering, from time to time, especially during heavy traffic on the public
Gigabit Ethernet interfaces of the cluster nodes, I get the following:
drbd0: PingAck did not arrive in time.
drbd0: peer( Primary -> Unknown ) conn( Connected -> NetworkFailure ) pdsk( UpToDate -> DUnknown ) susp( 0 -> 1 )
drbd0: asender terminated
drbd0: Terminating asender thread
drbd0: short read expecting header on sock: r=-512
drbd0: Creating new current UUID
drbd0: Connection closed
drbd0: helper command: /sbin/drbdadm fence-peer minor-0
drbd0: helper command: /sbin/drbdadm fence-peer minor-0 exit code 2 (0x200)
drbd0: fence-peer helper broken, returned 2
drbd0: Considering state change from bad state. Error would be: 'Refusing to be Primary while peer is not outdated'
drbd0: old = { cs:NetworkFailure ro:Primary/Unknown ds:UpToDate/DUnknown s--- }
drbd0: new = { cs:Unconnected ro:Primary/Unknown ds:UpToDate/DUnknown s--- }
drbd0: conn( NetworkFailure -> Unconnected )
drbd0: receiver terminated
drbd0: Restarting receiver thread
drbd0: receiver (re)started
drbd0: Considering state change from bad state. Error would be: 'Refusing to be Primary while peer is not outdated'
drbd0: old = { cs:Unconnected ro:Primary/Unknown ds:UpToDate/DUnknown s--- }
drbd0: new = { cs:WFConnection ro:Primary/Unknown ds:UpToDate/DUnknown s--- }
drbd0: conn( Unconnected -> WFConnection )
drbd1: PingAck did not arrive in time.
drbd1: peer( Primary -> Unknown ) conn( Connected -> NetworkFailure ) pdsk( UpToDate -> DUnknown ) susp( 0 -> 1 )
drbd1: asender terminated
drbd1: Terminating asender thread
drbd1: short read expecting header on sock: r=-512
drbd1: Creating new current UUID
drbd1: Connection closed
drbd1: helper command: /sbin/drbdadm fence-peer minor-1
drbd1: helper command: /sbin/drbdadm fence-peer minor-1 exit code 2 (0x200)
drbd1: fence-peer helper broken, returned 2
drbd1: Considering state change from bad state. Error would be: 'Refusing to be Primary while peer is not outdated'
drbd1: old = { cs:NetworkFailure ro:Primary/Unknown ds:UpToDate/DUnknown s--- }
drbd1: new = { cs:Unconnected ro:Primary/Unknown ds:UpToDate/DUnknown s--- }
drbd1: conn( NetworkFailure -> Unconnected )
drbd1: receiver terminated
drbd1: Restarting receiver thread
drbd1: receiver (re)started
drbd1: Considering state change from bad state. Error would be: 'Refusing to be Primary while peer is not outdated'
drbd1: old = { cs:Unconnected ro:Primary/Unknown ds:UpToDate/DUnknown s--- }
drbd1: new = { cs:WFConnection ro:Primary/Unknown ds:UpToDate/DUnknown s--- }
drbd1: conn( Unconnected -> WFConnection )
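To make the sequence above easier to follow, here is a minimal sketch (not part of DRBD or its tools) that pulls the "field( old -> new )" state transitions out of a log excerpt like this one. The function names and the regular expressions are my own assumptions based only on the line format shown above; the sketch also tolerates lines that a mail client has wrapped.

```python
import re

# Assumed format, taken from the excerpt above: "drbdN: ... field( Old -> New ) ..."
DEVICE = re.compile(r"^(drbd\d+):")
TRANSITION = re.compile(r"(\w+)\(\s*(\S+)\s*->\s*(\S+)\s*\)")


def join_wrapped(log_text):
    """Glue continuation lines (those not starting with 'drbdN:') to the previous line."""
    records = []
    for line in log_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if DEVICE.match(line) or not records:
            records.append(line)
        else:
            records[-1] += " " + line
    return records


def parse_transitions(log_text):
    """Return (device, field, old, new) tuples in the order they appear in the log."""
    results = []
    for record in join_wrapped(log_text):
        match = DEVICE.match(record)
        device = match.group(1) if match else None
        for field, old, new in TRANSITION.findall(record):
            results.append((device, field, old, new))
    return results


if __name__ == "__main__":
    example = (
        "drbd0: peer( Primary -> Unknown ) conn( Connected -> NetworkFailure ) pdsk(\n"
        "UpToDate -> DUnknown ) susp( 0 -> 1 )\n"
        "drbd0: conn( NetworkFailure -> Unconnected )\n"
    )
    for row in parse_transitions(example):
        print(row)
```

Run against the full excerpt, this lists the peer, conn, pdsk and susp changes per device, which shows drbd0 and drbd1 going through the same steps.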
Fencing is working, since the node that failed to send the PingAck gets
fenced (and rebooted).
However, does anyone have an idea why this is happening, given that there is
a private link for DRBD?
The machines are AMD X2 2 GHz with 4 GB RAM each.
Also, I could not find in the man pages or the online tutorial/manual the
parameters that would let me fine-tune this behavior, so I would appreciate
some help with that too.
Thank you all for your time.
Theophanis Kontogiannis