[DRBD-user] Connection problem after hardware change

Yves Glodt yg at mind.lu
Thu Jul 26 20:52:18 CEST 2007


Hello,

I have a system set up with two DRBD nodes. Today I changed the hardware
of the node that was secondary at the time.

After rebooting it, at some point the primary went into "StandAlone",
and the new secondary stays at "WFConnection".

When I run "drbdadm connect all" on the primary, it goes to
"WFConnection" for one second and writes this to the log:

Jul 26 20:45:35 drbd1 kernel: drbd0: drbdsetup [7972]: cstate StandAlone --> Unconnected
Jul 26 20:45:35 drbd1 kernel: drbd0: drbd0_receiver [7973]: cstate Unconnected --> WFConnection
Jul 26 20:45:37 drbd1 kernel: drbd0: drbd0_receiver [7973]: cstate WFConnection --> WFReportParams
Jul 26 20:45:37 drbd1 kernel: drbd0: Handshake successful: DRBD Network Protocol version 74
Jul 26 20:45:37 drbd1 kernel: drbd0: Connection established.
Jul 26 20:45:37 drbd1 kernel: drbd0: I am(P): 1:00000010:00000001:00000074:00000010:10
Jul 26 20:45:37 drbd1 kernel: drbd0: Peer(S): 1:00000011:00000001:0000005f:0000000f:00
Jul 26 20:45:37 drbd1 kernel: drbd0: drbd0_receiver [7973]: cstate WFReportParams --> StandAlone
Jul 26 20:45:37 drbd1 kernel: drbd0: asender terminated
Jul 26 20:45:37 drbd1 kernel: drbd0: worker terminated
Jul 26 20:45:37 drbd1 kernel: drbd0: drbd0_receiver [7973]: cstate StandAlone --> StandAlone
Jul 26 20:45:37 drbd1 kernel: drbd0: Connection lost.
Jul 26 20:45:37 drbd1 kernel: drbd0: receiver terminated
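
To make the sequence of connection states easier to follow, here is a
minimal Python sketch that pulls the "cstate A --> B" transitions out of
a log excerpt like the one above. The function name and the regular
expression are my own choices for illustration, not part of DRBD, and
the sketch assumes the log uses exactly the "cstate X --> Y" wording
shown in this message.

```python
import re

# Matches the two state names around "-->" in a DRBD "cstate" log line.
# Assumption: state names consist of word characters only.
CSTATE_RE = re.compile(r"cstate\s+(\w+)\s+-->\s+(\w+)")


def cstate_transitions(log_text):
    """Return (old_state, new_state) pairs in the order they appear."""
    # Mail clients wrap long log lines, so collapse all whitespace
    # (including line breaks) into single spaces before matching.
    flat = " ".join(log_text.split())
    return CSTATE_RE.findall(flat)


# Excerpt of the primary's log from this message, wrapped as in the mail.
PRIMARY_LOG = """
Jul 26 20:45:35 drbd1 kernel: drbd0: drbdsetup [7972]: cstate
StandAlone --> Unconnected
Jul 26 20:45:35 drbd1 kernel: drbd0: drbd0_receiver [7973]: cstate
Unconnected --> WFConnection
Jul 26 20:45:37 drbd1 kernel: drbd0: drbd0_receiver [7973]: cstate
WFConnection --> WFReportParams
Jul 26 20:45:37 drbd1 kernel: drbd0: Connection established.
Jul 26 20:45:37 drbd1 kernel: drbd0: drbd0_receiver [7973]: cstate
WFReportParams --> StandAlone
"""

transitions = cstate_transitions(PRIMARY_LOG)
for old, new in transitions:
    print(f"{old} -> {new}")
```

On the primary's log this shows that the node drops back to StandAlone
straight from WFReportParams, i.e. right after the two nodes have
exchanged their parameters.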


At the same time, the secondary logs the following:

Jul 26 20:45:37 drbd2 kernel: drbd0: drbd0_receiver [9688]: cstate WFConnection --> WFReportParams
Jul 26 20:45:37 drbd2 kernel: drbd0: Handshake successful: DRBD Network Protocol version 74
Jul 26 20:45:37 drbd2 kernel: drbd0: Connection established.
Jul 26 20:45:37 drbd2 kernel: drbd0: I am(S): 1:00000011:00000001:0000005f:0000000f:00
Jul 26 20:45:37 drbd2 kernel: drbd0: Peer(P): 1:00000010:00000001:00000074:00000010:10
Jul 26 20:45:37 drbd2 kernel: drbd0: drbd0_receiver [9688]: cstate WFReportParams --> WFBitMapS
Jul 26 20:45:37 drbd2 kernel: drbd0: drbd0_asender [10619]: cstate WFBitMapS --> NetworkFailure
Jul 26 20:45:37 drbd2 kernel: drbd0: asender terminated
Jul 26 20:45:37 drbd2 kernel: drbd0: drbd0_receiver [9688]: cstate NetworkFailure --> BrokenPipe
Jul 26 20:45:37 drbd2 kernel: drbd0: Secondary/Unknown --> Secondary/Primary
Jul 26 20:45:37 drbd2 kernel: drbd0: sock was shut down by peer
Jul 26 20:45:37 drbd2 kernel: drbd0: drbd0_receiver [9688]: cstate BrokenPipe --> BrokenPipe
Jul 26 20:45:37 drbd2 kernel: drbd0: worker terminated
Jul 26 20:45:37 drbd2 kernel: drbd0: drbd0_receiver [9688]: cstate BrokenPipe --> Unconnected
Jul 26 20:45:37 drbd2 kernel: drbd0: Connection lost.
Jul 26 20:45:37 drbd2 kernel: drbd0: drbd0_receiver [9688]: cstate Unconnected --> WFConnection
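
Both logs print an "I am(...)" and a "Peer(...)" line with
colon-separated counters, and the two nodes clearly disagree on some of
them. The Python sketch below only reports which positions differ. It
deliberately does not name the fields, because their meaning is not
stated anywhere in this message; the helper names are hypothetical and
the field layout is assumed to be exactly what the log lines show.

```python
import re

# Captures who is reporting ("I am" or "Peer"), the role letter in
# parentheses, and the colon-separated counter string that follows.
COUNTER_RE = re.compile(r"(I am|Peer)\((\w)\):\s*([0-9a-fA-F:]+)")


def parse_counters(log_text):
    """Map "I am" / "Peer" to (role_letter, list_of_field_strings)."""
    # Collapse line breaks so that wrapped log lines still match.
    flat = " ".join(log_text.split())
    result = {}
    for who, role, raw in COUNTER_RE.findall(flat):
        result[who] = (role, raw.split(":"))
    return result


def differing_fields(counters):
    """Return the zero-based positions where the two sides disagree."""
    _, mine = counters["I am"]
    _, peer = counters["Peer"]
    return [i for i, (a, b) in enumerate(zip(mine, peer)) if a != b]


# The two counter lines from the secondary's log, wrapped as in the mail.
SECONDARY_LOG = """
Jul 26 20:45:37 drbd2 kernel: drbd0: I am(S):
1:00000011:00000001:0000005f:0000000f:00
Jul 26 20:45:37 drbd2 kernel: drbd0: Peer(P):
1:00000010:00000001:00000074:00000010:10
"""

counters = parse_counters(SECONDARY_LOG)
diff = differing_fields(counters)
for i in diff:
    print(i, counters["I am"][1][i], counters["Peer"][1][i])
```

Which of these differences matters for the connection attempt depends on
how this DRBD version interprets each field, so the output is a starting
point for reading the documentation rather than a diagnosis.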


The network connection between the two nodes is a direct link between
two dedicated NICs, and that link works: I can ssh and scp without
problems. Nothing has changed in terms of device names either: sda is
still sda, and eth0 and eth1 are still in the same order.
Note that before the hardware change, both nodes ran perfectly.
The systems run Debian etch with an amd64 Xen kernel.

What is going wrong here? :-)

best regards!
yves



-- 
Linux 2.6.20-16-generic #2 SMP Thu Jun 7 20:19:32 UTC 2007 i686
 20:45:19 up  1:42,  1 user,  load average: 0.56, 0.45, 0.38


