Hello,

I have a system set up with two DRBD nodes. Today I changed the hardware of the node that was secondary at the time. After rebooting it, at some point the primary went into "StandAlone", and the new secondary stays at "WFConnection".

When I run "drbdadm connect all" on the primary, it goes to "WFConnection" for one second and writes this to the log:

Jul 26 20:45:35 drbd1 kernel: drbd0: drbdsetup [7972]: cstate StandAlone --> Unconnected
Jul 26 20:45:35 drbd1 kernel: drbd0: drbd0_receiver [7973]: cstate Unconnected --> WFConnection
Jul 26 20:45:37 drbd1 kernel: drbd0: drbd0_receiver [7973]: cstate WFConnection --> WFReportParams
Jul 26 20:45:37 drbd1 kernel: drbd0: Handshake successful: DRBD Network Protocol version 74
Jul 26 20:45:37 drbd1 kernel: drbd0: Connection established.
Jul 26 20:45:37 drbd1 kernel: drbd0: I am(P): 1:00000010:00000001:00000074:00000010:10
Jul 26 20:45:37 drbd1 kernel: drbd0: Peer(S): 1:00000011:00000001:0000005f:0000000f:00
Jul 26 20:45:37 drbd1 kernel: drbd0: drbd0_receiver [7973]: cstate WFReportParams --> StandAlone
Jul 26 20:45:37 drbd1 kernel: drbd0: asender terminated
Jul 26 20:45:37 drbd1 kernel: drbd0: worker terminated
Jul 26 20:45:37 drbd1 kernel: drbd0: drbd0_receiver [7973]: cstate StandAlone --> StandAlone
Jul 26 20:45:37 drbd1 kernel: drbd0: Connection lost.
Jul 26 20:45:37 drbd1 kernel: drbd0: receiver terminated

At the same time, the secondary shows the following:

Jul 26 20:45:37 drbd2 kernel: drbd0: drbd0_receiver [9688]: cstate WFConnection --> WFReportParams
Jul 26 20:45:37 drbd2 kernel: drbd0: Handshake successful: DRBD Network Protocol version 74
Jul 26 20:45:37 drbd2 kernel: drbd0: Connection established.
Jul 26 20:45:37 drbd2 kernel: drbd0: I am(S): 1:00000011:00000001:0000005f:0000000f:00
Jul 26 20:45:37 drbd2 kernel: drbd0: Peer(P): 1:00000010:00000001:00000074:00000010:10
Jul 26 20:45:37 drbd2 kernel: drbd0: drbd0_receiver [9688]: cstate WFReportParams --> WFBitMapS
Jul 26 20:45:37 drbd2 kernel: drbd0: drbd0_asender [10619]: cstate WFBitMapS --> NetworkFailure
Jul 26 20:45:37 drbd2 kernel: drbd0: asender terminated
Jul 26 20:45:37 drbd2 kernel: drbd0: drbd0_receiver [9688]: cstate NetworkFailure --> BrokenPipe
Jul 26 20:45:37 drbd2 kernel: drbd0: Secondary/Unknown --> Secondary/Primary
Jul 26 20:45:37 drbd2 kernel: drbd0: sock was shut down by peer
Jul 26 20:45:37 drbd2 kernel: drbd0: drbd0_receiver [9688]: cstate BrokenPipe --> BrokenPipe
Jul 26 20:45:37 drbd2 kernel: drbd0: worker terminated
Jul 26 20:45:37 drbd2 kernel: drbd0: drbd0_receiver [9688]: cstate BrokenPipe --> Unconnected
Jul 26 20:45:37 drbd2 kernel: drbd0: Connection lost.
Jul 26 20:45:37 drbd2 kernel: drbd0: drbd0_receiver [9688]: cstate Unconnected --> WFConnection

The network connection between the two nodes is a direct link between two dedicated NICs, and that link works: I can ssh and scp happily. Also, nothing has changed in terms of device names; sda is still sda, and eth0 and eth1 are still in the same order.

Note that before the hardware change, both nodes ran perfectly...

The systems are Debian etch with the amd64 Xen kernel.

What is going wrong here? :-)

best regards!
yves

-- 
Linux 2.6.20-16-generic #2 SMP Thu Jun 7 20:19:32 UTC 2007 i686
 20:45:19 up 1:42, 1 user, load average: 0.56, 0.45, 0.38
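
For anyone comparing the "I am(P)" and "Peer(S)" lines from the drbd1 log above, here is a minimal Python sketch that splits the two colon-separated counter strings and lists the positions where they differ. The function names and the regular expression are illustrative assumptions, not part of DRBD, and the sketch deliberately does not interpret what each field means.

```python
import re

# The two counter lines, copied verbatim from the drbd1 (primary) log above.
LOG_LINES = [
    "drbd0: I am(P): 1:00000010:00000001:00000074:00000010:10",
    "drbd0: Peer(S): 1:00000011:00000001:0000005f:0000000f:00",
]

# Hypothetical pattern for this log format: who, role letter, counter string.
PATTERN = re.compile(r"(I am|Peer)\((P|S)\):\s*([0-9a-fA-F:]+)")


def parse_counters(line):
    """Return (who, role, fields) for an 'I am(..)' or 'Peer(..)' log line."""
    match = PATTERN.search(line)
    if match is None:
        raise ValueError("no counter string found in: " + line)
    who, role, raw = match.groups()
    return who, role, raw.split(":")


def differing_fields(local, peer):
    """Return (index, local_value, peer_value) for every field that differs."""
    return [(i, a, b) for i, (a, b) in enumerate(zip(local, peer)) if a != b]


_, _, mine = parse_counters(LOG_LINES[0])
_, _, peer = parse_counters(LOG_LINES[1])
diffs = differing_fields(mine, peer)

for index, local_value, peer_value in diffs:
    print(f"field {index}: local={local_value} peer={peer_value}")
```

The script only reports which positions differ between the two nodes; which difference (if any) makes the primary drop to StandAlone is a question for the DRBD documentation for this version.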