Hello,
I recently migrated our test cluster to a VMware environment. Both nodes are
virtual guests, with a virtual LAN for DRBD.
After some days running without problems, the resources on both nodes
became "StandAlone" (I suppose the problem is due to a very high load
on the VMware host), and now I cannot "connect" the resources.
When I try to "connect" the drbd0 resource, I get these messages in
"/var/log/messages" on one node:
Jan 28 14:00:55 noeud1 kernel: drbd0: conn( StandAlone -> Unconnected )
Jan 28 14:00:55 noeud1 kernel: drbd0: receiver (re)started
Jan 28 14:00:55 noeud1 kernel: drbd0: conn( Unconnected -> WFConnection )
Jan 28 14:00:56 noeud1 kernel: drbd0: Handshake successful: Agreed network protocol version 88
Jan 28 14:00:56 noeud1 kernel: drbd0: Peer authenticated using 64 bytes of 'sha512' HMAC
Jan 28 14:00:56 noeud1 kernel: drbd0: conn( WFConnection -> WFReportParams )
Jan 28 14:00:56 noeud1 kernel: drbd0: data-integrity-alg: <not-used>
Jan 28 14:00:57 noeud1 kernel: drbd0: Split-Brain detected, dropping connection!
Jan 28 14:00:57 noeud1 kernel: drbd0: self B3D5FA91510F45DE:D289EE2D946147A1:177216482D86F5DE:332BCB37708E13F1
Jan 28 14:00:57 noeud1 kernel: drbd0: peer 3850AFF7C480B265:D289EE2D946147A1:177216482D86F5DE:332BCB37708E13F1
Jan 28 14:00:57 noeud1 kernel: drbd0: conn( WFReportParams -> Disconnecting )
Jan 28 14:00:57 noeud1 kernel: drbd0: helper command: /sbin/drbdadm split-brain
Jan 28 14:00:57 noeud1 kernel: drbd0: meta connection shut down by peer.
Jan 28 14:00:57 noeud1 kernel: drbd0: asender terminated
Jan 28 14:00:57 noeud1 kernel: drbd0: error receiving ReportState, l: 4!
Jan 28 14:00:57 noeud1 kernel: drbd0: tl_clear()
Jan 28 14:00:57 noeud1 kernel: drbd0: Connection closed
Jan 28 14:00:57 noeud1 kernel: drbd0: conn( Disconnecting -> StandAlone )
Jan 28 14:00:57 noeud1 kernel: drbd0: receiver terminated
On the other node, "/var/log/messages" shows these messages:
Jan 28 14:00:40 noeud2 kernel: drbd0: conn( StandAlone -> Unconnected )
Jan 28 14:00:40 noeud2 kernel: drbd0: receiver (re)started
Jan 28 14:00:40 noeud2 kernel: drbd0: conn( Unconnected -> WFConnection )
Jan 28 14:00:53 noeud2 kernel: drbd0: Handshake successful: Agreed network protocol version 88
Jan 28 14:00:53 noeud2 kernel: drbd0: Peer authenticated using 64 bytes of 'sha512' HMAC
Jan 28 14:00:54 noeud2 kernel: drbd0: conn( WFConnection -> WFReportParams )
Jan 28 14:00:54 noeud2 kernel: drbd0: data-integrity-alg: <not-used>
Jan 28 14:00:54 noeud2 kernel: drbd0: Split-Brain detected, dropping connection!
Jan 28 14:00:54 noeud2 kernel: drbd0: self 3850AFF7C480B265:D289EE2D946147A1:177216482D86F5DE:332BCB37708E13F1
Jan 28 14:00:54 noeud2 kernel: drbd0: peer B3D5FA91510F45DE:D289EE2D946147A1:177216482D86F5DE:332BCB37708E13F1
Jan 28 14:00:54 noeud2 kernel: drbd0: conn( WFReportParams -> Disconnecting )
Jan 28 14:00:54 noeud2 kernel: drbd0: helper command: /sbin/drbdadm split-brain
Jan 28 14:00:54 noeud2 kernel: drbd0: error receiving ReportState, l: 4!
Jan 28 14:00:54 noeud2 kernel: drbd0: asender terminated
Jan 28 14:00:54 noeud2 kernel: drbd0: tl_clear()
Jan 28 14:00:54 noeud2 kernel: drbd0: Connection closed
Jan 28 14:00:54 noeud2 kernel: drbd0: conn( Disconnecting -> StandAlone )
Jan 28 14:00:54 noeud2 kernel: drbd0: receiver terminated
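Comparing the "self" and "peer" lines from both logs, the two nodes seem to disagree only on the first identifier. Here is a minimal Python sketch of that comparison. The field names (current, bitmap, two history entries) are my assumption about how DRBD orders these identifiers; they do not appear in the logs themselves.

```python
# Identifiers copied from the "self" lines in the two logs above.
NOEUD1 = "B3D5FA91510F45DE:D289EE2D946147A1:177216482D86F5DE:332BCB37708E13F1"
NOEUD2 = "3850AFF7C480B265:D289EE2D946147A1:177216482D86F5DE:332BCB37708E13F1"

# Assumed field order; not stated anywhere in the log output.
FIELDS = ("current", "bitmap", "history1", "history2")


def compare_uuids(a: str, b: str) -> dict:
    """Return {field_name: True if both nodes hold the same value}."""
    parts_a, parts_b = a.split(":"), b.split(":")
    if len(parts_a) != len(FIELDS) or len(parts_b) != len(FIELDS):
        raise ValueError("expected four colon-separated identifiers")
    return {name: x == y for name, x, y in zip(FIELDS, parts_a, parts_b)}


if __name__ == "__main__":
    for name, same in compare_uuids(NOEUD1, NOEUD2).items():
        print(f"{name:9s} {'same' if same else 'DIFFERENT'}")
```

If the assumed field order is right, only the first ("current") value differs between the nodes, while the other three are identical on both sides.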
After stopping DRBD on each node, I get the following DRBD status:
[root@noeud1 ~]# cat /proc/drbd
version: 8.2.4 (api:88/proto:86-88)
GIT-hash: fc00c6e00a1b6039bfcebe37afa3e7e28dbd92fa build by root@francis.apec.fr, 2008-01-11 16:59:43
0: cs:StandAlone st:Secondary/Unknown ds:UpToDate/DUnknown r---
ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0
resync: used:0/31 hits:0 misses:0 starving:0 dirty:0 changed:0
act_log: used:0/257 hits:0 misses:0 starving:0 dirty:0 changed:0
1: cs:StandAlone st:Secondary/Unknown ds:UpToDate/DUnknown r---
ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0
resync: used:0/31 hits:0 misses:0 starving:0 dirty:0 changed:0
act_log: used:0/257 hits:0 misses:0 starving:0 dirty:0 changed:0
2: cs:StandAlone st:Secondary/Unknown ds:UpToDate/DUnknown r---
ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0
resync: used:0/31 hits:0 misses:0 starving:0 dirty:0 changed:0
act_log: used:0/257 hits:0 misses:0 starving:0 dirty:0 changed:0
3: cs:StandAlone st:Secondary/Unknown ds:UpToDate/DUnknown r---
ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0
resync: used:0/31 hits:0 misses:0 starving:0 dirty:0 changed:0
act_log: used:0/257 hits:0 misses:0 starving:0 dirty:0 changed:0
[root@noeud2 ~]# cat /proc/drbd
version: 8.2.4 (api:88/proto:86-88)
GIT-hash: fc00c6e00a1b6039bfcebe37afa3e7e28dbd92fa build by root@francis.apec.fr, 2008-01-11 16:59:43
0: cs:StandAlone st:Secondary/Unknown ds:UpToDate/DUnknown r---
ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0
resync: used:0/31 hits:0 misses:0 starving:0 dirty:0 changed:0
act_log: used:0/257 hits:0 misses:0 starving:0 dirty:0 changed:0
1: cs:StandAlone st:Secondary/Unknown ds:UpToDate/DUnknown r---
ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0
resync: used:0/31 hits:0 misses:0 starving:0 dirty:0 changed:0
act_log: used:0/257 hits:0 misses:0 starving:0 dirty:0 changed:0
2: cs:StandAlone st:Secondary/Unknown ds:UpToDate/DUnknown r---
ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0
resync: used:0/31 hits:0 misses:0 starving:0 dirty:0 changed:0
act_log: used:0/257 hits:0 misses:0 starving:0 dirty:0 changed:0
3: cs:StandAlone st:Secondary/Unknown ds:UpToDate/DUnknown r---
ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0
resync: used:0/31 hits:0 misses:0 starving:0 dirty:0 changed:0
act_log: used:0/257 hits:0 misses:0 starving:0 dirty:0 changed:0
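To make the two status dumps easier to compare, here is a rough Python sketch that pulls the connection state, roles and disk states out of /proc/drbd text. The function name `parse_proc_drbd` and the regular expression are my own and only cover the 8.2.4 line layout shown above; other DRBD versions may print the device line differently.

```python
import re

# Matches a device line such as:
#   0: cs:StandAlone st:Secondary/Unknown ds:UpToDate/DUnknown r---
DEVICE_LINE = re.compile(
    r"^\s*(?P<minor>\d+):\s+cs:(?P<cs>\S+)\s+st:(?P<st>\S+)\s+ds:(?P<ds>\S+)"
)

# Two device entries copied from the output above, for illustration.
SAMPLE = """\
version: 8.2.4 (api:88/proto:86-88)
0: cs:StandAlone st:Secondary/Unknown ds:UpToDate/DUnknown r---
ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0
1: cs:StandAlone st:Secondary/Unknown ds:UpToDate/DUnknown r---
ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0
"""


def parse_proc_drbd(text: str) -> list:
    """Return one dict per DRBD device found in /proc/drbd-style text."""
    devices = []
    for line in text.splitlines():
        match = DEVICE_LINE.match(line)
        if not match:
            continue  # skip version, counter, resync and act_log lines
        local_role, peer_role = match.group("st").split("/", 1)
        local_disk, peer_disk = match.group("ds").split("/", 1)
        devices.append({
            "minor": int(match.group("minor")),
            "connection": match.group("cs"),
            "local_role": local_role,
            "peer_role": peer_role,
            "local_disk": local_disk,
            "peer_disk": peer_disk,
        })
    return devices


if __name__ == "__main__":
    for dev in parse_proc_drbd(SAMPLE):
        print(dev["minor"], dev["connection"], dev["local_role"], dev["local_disk"])
```

On a real node the text could be read from /proc/drbd instead of the `SAMPLE` string.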
Best regards.
Francis