[DRBD-user] Connection problem after hardware change
Yves Glodt
yg at mind.lu
Thu Jul 26 20:52:18 CEST 2007
Hello,
I have a system set up with two DRBD nodes. Today I changed the hardware
of the node that was secondary at the time.
After rebooting it, at some point the primary went into "StandAlone",
and the new secondary stays at "WFConnection".
When I run "drbdadm connect all" on the primary, it goes
to "WFConnection" for about a second, and writes this to the log:
Jul 26 20:45:35 drbd1 kernel: drbd0: drbdsetup [7972]: cstate
StandAlone --> Unconnected
Jul 26 20:45:35 drbd1 kernel: drbd0: drbd0_receiver [7973]: cstate
Unconnected --> WFConnection
Jul 26 20:45:37 drbd1 kernel: drbd0: drbd0_receiver [7973]: cstate
WFConnection --> WFReportParams
Jul 26 20:45:37 drbd1 kernel: drbd0: Handshake successful: DRBD Network
Protocol version 74
Jul 26 20:45:37 drbd1 kernel: drbd0: Connection established.
Jul 26 20:45:37 drbd1 kernel: drbd0: I am(P):
1:00000010:00000001:00000074:00000010:10
Jul 26 20:45:37 drbd1 kernel: drbd0: Peer(S):
1:00000011:00000001:0000005f:0000000f:00
Jul 26 20:45:37 drbd1 kernel: drbd0: drbd0_receiver [7973]: cstate
WFReportParams --> StandAlone
Jul 26 20:45:37 drbd1 kernel: drbd0: asender terminated
Jul 26 20:45:37 drbd1 kernel: drbd0: worker terminated
Jul 26 20:45:37 drbd1 kernel: drbd0: drbd0_receiver [7973]: cstate
StandAlone --> StandAlone
Jul 26 20:45:37 drbd1 kernel: drbd0: Connection lost.
Jul 26 20:45:37 drbd1 kernel: drbd0: receiver terminated
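To make the sequence of connection states easier to follow, the cstate transitions can be pulled out of such a log with a few lines of Python. This is only a minimal sketch: the sample text is copied from the primary's log above, the regular expression is my own assumption about the line format (it tolerates the line wraps added by the mail client), and the names LOG and cstate_transitions are hypothetical.

```python
import re

# Excerpt of the primary's kernel log, wrapped as in the mail.
LOG = """\
Jul 26 20:45:35 drbd1 kernel: drbd0: drbdsetup [7972]: cstate
StandAlone --> Unconnected
Jul 26 20:45:35 drbd1 kernel: drbd0: drbd0_receiver [7973]: cstate
Unconnected --> WFConnection
Jul 26 20:45:37 drbd1 kernel: drbd0: drbd0_receiver [7973]: cstate
WFConnection --> WFReportParams
Jul 26 20:45:37 drbd1 kernel: drbd0: drbd0_receiver [7973]: cstate
WFReportParams --> StandAlone
"""

def cstate_transitions(text):
    """Return (old_state, new_state) pairs in the order they appear.

    \\s+ also matches newlines, so a transition that was wrapped onto
    the next line is still found.
    """
    return re.findall(r"cstate\s+(\w+)\s+-->\s+(\w+)", text)

for old, new in cstate_transitions(LOG):
    print(f"{old} -> {new}")
```

On this excerpt the last pair is WFReportParams -> StandAlone, which is the point where the primary drops the connection again.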
At the same time, the secondary logs the following:
Jul 26 20:45:37 drbd2 kernel: drbd0: drbd0_receiver [9688]: cstate
WFConnection --> WFReportParams
Jul 26 20:45:37 drbd2 kernel: drbd0: Handshake successful: DRBD Network
Protocol version 74
Jul 26 20:45:37 drbd2 kernel: drbd0: Connection established.
Jul 26 20:45:37 drbd2 kernel: drbd0: I am(S):
1:00000011:00000001:0000005f:0000000f:00
Jul 26 20:45:37 drbd2 kernel: drbd0: Peer(P):
1:00000010:00000001:00000074:00000010:10
Jul 26 20:45:37 drbd2 kernel: drbd0: drbd0_receiver [9688]: cstate
WFReportParams --> WFBitMapS
Jul 26 20:45:37 drbd2 kernel: drbd0: drbd0_asender [10619]: cstate
WFBitMapS --> NetworkFailure
Jul 26 20:45:37 drbd2 kernel: drbd0: asender terminated
Jul 26 20:45:37 drbd2 kernel: drbd0: drbd0_receiver [9688]: cstate
NetworkFailure --> BrokenPipe
Jul 26 20:45:37 drbd2 kernel: drbd0: Secondary/Unknown -->
Secondary/Primary
Jul 26 20:45:37 drbd2 kernel: drbd0: sock was shut down by peer
Jul 26 20:45:37 drbd2 kernel: drbd0: drbd0_receiver [9688]: cstate
BrokenPipe --> BrokenPipe
Jul 26 20:45:37 drbd2 kernel: drbd0: worker terminated
Jul 26 20:45:37 drbd2 kernel: drbd0: drbd0_receiver [9688]: cstate
BrokenPipe --> Unconnected
Jul 26 20:45:37 drbd2 kernel: drbd0: Connection lost.
Jul 26 20:45:37 drbd2 kernel: drbd0: drbd0_receiver [9688]: cstate
Unconnected --> WFConnection
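Both nodes also print a colon-separated counter string for themselves and for the peer ("I am" / "Peer"). The following minimal sketch only shows which fields differ between the two strings; it makes no assumption about what each field means, and the names primary, secondary and differing_fields are hypothetical.

```python
# Counter strings copied from the "I am(P)" and "I am(S)" log lines above.
primary = "1:00000010:00000001:00000074:00000010:10"
secondary = "1:00000011:00000001:0000005f:0000000f:00"

def differing_fields(a, b):
    """Return (index, a_field, b_field) for every colon-separated field that differs."""
    pairs = zip(a.split(":"), b.split(":"))
    return [(i, x, y) for i, (x, y) in enumerate(pairs) if x != y]

for index, p, s in differing_fields(primary, secondary):
    print(f"field {index}: primary={p} secondary={s}")
```

Counting from zero, fields 1, 3, 4 and 5 differ between the two nodes, while fields 0 and 2 are identical.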
The network connection between the two nodes is a direct link between two
dedicated NICs, and that link works: I can ssh and scp across it happily.
Also, nothing has changed in terms of device names: sda is still sda,
and eth0 and eth1 are still in the same order.
Note that before the hardware change, both nodes ran perfectly...
Both systems run Debian etch with the amd64 Xen kernel.
What is going wrong here? :-)
Best regards,
yves
--
Linux 2.6.20-16-generic #2 SMP Thu Jun 7 20:19:32 UTC 2007 i686
20:45:19 up 1:42, 1 user, load average: 0.56, 0.45, 0.38