Hello,

Thanks for your reply.

Bernd Schubert wrote:
>
> This should't be neccessary at all. What does your logfiles say? Does it work
> if you manually do a /etc/init.d/drdb restart (in very rare case we also have
> to do this)?
> Rebooting shouldn't be neccessary to basically test the situation. Just
> stopping drbd on the master node, making the device primary on the failover
> and starting drbd on the master node should be sufficient.
>

When I do that it works as expected, but when I instead do "reboot" on the primary, it does not reconnect automatically. Instead, the following messages appear on the new primary once the old one comes up again:

Mar 31 15:24:56 ha-beo-1 kernel: drbd0: drbd0_receiver [6076]: cstate WFConnection --> WFReportParams
Mar 31 15:24:56 ha-beo-1 kernel: drbd0: Handshake successful: DRBD Network Protocol version 74
Mar 31 15:24:56 ha-beo-1 kernel: drbd0: Connection established.
Mar 31 15:24:56 ha-beo-1 kernel: drbd1: drbd1_receiver [6084]: cstate WFConnection --> WFReportParams
Mar 31 15:24:56 ha-beo-1 kernel: drbd1: Handshake successful: DRBD Network Protocol version 74
Mar 31 15:24:56 ha-beo-1 kernel: drbd0: I am(P): 1:00000001:00000001:00000024:00000011:10
Mar 31 15:24:56 ha-beo-1 kernel: drbd0: Peer(S): 1:00000001:00000001:00000025:00000010:10
Mar 31 15:24:56 ha-beo-1 kernel: drbd0: Current Primary shall become sync TARGET! Aborting to prevent data corruption.
Mar 31 15:24:56 ha-beo-1 kernel: drbd0: drbd0_receiver [6076]: cstate WFReportParams --> StandAlone
Mar 31 15:24:56 ha-beo-1 kernel: drbd0: error receiving ReportParams, l: 72!
Mar 31 15:24:56 ha-beo-1 kernel: drbd0: worker terminated
Mar 31 15:24:56 ha-beo-1 kernel: drbd0: asender terminated
Mar 31 15:24:56 ha-beo-1 kernel: drbd0: drbd0_receiver [6076]: cstate StandAlone --> StandAlone
Mar 31 15:24:56 ha-beo-1 kernel: drbd0: Connection lost.
Mar 31 15:24:56 ha-beo-1 kernel: drbd0: receiver terminated
Mar 31 15:24:57 ha-beo-1 kernel: drbd1: Connection established.
Mar 31 15:24:57 ha-beo-1 kernel: drbd1: I am(P): 1:00000002:00000001:00000017:0000000c:10
Mar 31 15:24:57 ha-beo-1 kernel: drbd1: Peer(S): 1:00000002:00000001:00000018:0000000b:10
Mar 31 15:24:57 ha-beo-1 kernel: drbd1: Current Primary shall become sync TARGET! Aborting to prevent data corruption.
Mar 31 15:24:57 ha-beo-1 kernel: drbd1: drbd1_receiver [6084]: cstate WFReportParams --> StandAlone
Mar 31 15:24:57 ha-beo-1 kernel: drbd1: error receiving ReportParams, l: 72!
Mar 31 15:24:57 ha-beo-1 kernel: drbd1: worker terminated
Mar 31 15:24:57 ha-beo-1 kernel: drbd1: asender terminated
Mar 31 15:24:57 ha-beo-1 kernel: drbd1: drbd1_receiver [6084]: cstate StandAlone --> StandAlone
Mar 31 15:24:57 ha-beo-1 kernel: drbd1: Connection lost.
Mar 31 15:24:57 ha-beo-1 kernel: drbd1: receiver terminated

and these are the messages on the old primary:

Mar 31 15:24:56 ha-beo-2 kernel: drbd1: drbdsetup [1483]: cstate Unconfigured --> StandAlone
Mar 31 15:24:56 ha-beo-2 kernel: drbd0: drbdsetup [1513]: cstate StandAlone --> Unconnected
Mar 31 15:24:56 ha-beo-2 kernel: drbd0: drbd0_receiver [1514]: cstate Unconnected --> WFConnection
Mar 31 15:24:56 ha-beo-2 kernel: drbd1: drbdsetup [1521]: cstate StandAlone --> Unconnected
Mar 31 15:24:56 ha-beo-2 kernel: drbd1: drbd1_receiver [1522]: cstate Unconnected --> WFConnection
Mar 31 15:24:56 ha-beo-2 kernel: drbd0: drbd0_receiver [1514]: cstate WFConnection --> WFReportParams
Mar 31 15:24:56 ha-beo-2 kernel: drbd0: Handshake successful: DRBD Network Protocol version 74
Mar 31 15:24:56 ha-beo-2 kernel: drbd0: Connection established.
Mar 31 15:24:56 ha-beo-2 kernel: drbd1: drbd1_receiver [1522]: cstate WFConnection --> WFReportParams
Mar 31 15:24:56 ha-beo-2 kernel: drbd1: Handshake successful: DRBD Network Protocol version 74
Mar 31 15:24:56 ha-beo-2 kernel: drbd0: I am(S): 1:00000001:00000001:00000025:00000010:10
Mar 31 15:24:56 ha-beo-2 kernel: drbd1: Connection established.
Mar 31 15:24:56 ha-beo-2 kernel: drbd1: I am(S): 1:00000002:00000001:00000018:0000000b:10
Mar 31 15:24:56 ha-beo-2 kernel: drbd1: Peer(P): 1:00000002:00000001:00000017:0000000c:10
Mar 31 15:24:56 ha-beo-2 kernel: drbd1: drbd1_receiver [1522]: cstate WFReportParams --> WFBitMapS
Mar 31 15:24:56 ha-beo-2 kernel: drbd0: Peer(P): 1:00000001:00000001:00000024:00000011:10
Mar 31 15:24:56 ha-beo-2 kernel: drbd0: drbd0_receiver [1514]: cstate WFReportParams --> WFBitMapS
Mar 31 15:24:56 ha-beo-2 kernel: drbd0: meta connection shut down by peer.
Mar 31 15:24:56 ha-beo-2 kernel: drbd0: asender terminated
Mar 31 15:24:56 ha-beo-2 kernel: drbd0: sock_sendmsg returned -32
Mar 31 15:24:56 ha-beo-2 kernel: drbd0: drbd0_receiver [1514]: cstate WFBitMapS --> BrokenPipe
Mar 31 15:24:56 ha-beo-2 kernel: drbd0: short sent ReportBitMap size=4096 sent=0
Mar 31 15:24:56 ha-beo-2 kernel: drbd0: Secondary/Unknown --> Secondary/Primary
Mar 31 15:24:56 ha-beo-2 kernel: drbd0: sock was shut down by peer
Mar 31 15:24:56 ha-beo-2 kernel: drbd0: drbd0_receiver [1514]: cstate BrokenPipe --> BrokenPipe
Mar 31 15:24:56 ha-beo-2 kernel: drbd0: short read expecting header on sock: r=0
Mar 31 15:24:56 ha-beo-2 kernel: drbd0: worker terminated
Mar 31 15:24:56 ha-beo-2 kernel: drbd0: drbd0_receiver [1514]: cstate BrokenPipe --> Unconnected
Mar 31 15:24:56 ha-beo-2 kernel: drbd0: Connection lost.
Mar 31 15:24:56 ha-beo-2 kernel: drbd0: drbd0_receiver [1514]: cstate Unconnected --> WFConnection
Mar 31 15:24:57 ha-beo-2 kernel: drbd1: meta connection shut down by peer.
Mar 31 15:24:57 ha-beo-2 kernel: drbd1: sock_sendmsg returned -104
Mar 31 15:24:57 ha-beo-2 kernel: drbd1: drbd1_receiver [1522]: cstate WFBitMapS --> BrokenPipe
Mar 31 15:24:57 ha-beo-2 kernel: drbd1: short sent ReportBitMap size=4096 sent=2472
Mar 31 15:24:57 ha-beo-2 kernel: drbd1: Secondary/Unknown --> Secondary/Primary
Mar 31 15:24:57 ha-beo-2 kernel: drbd1: asender terminated
Mar 31 15:24:57 ha-beo-2 kernel: drbd1: sock was shut down by peer
Mar 31 15:24:57 ha-beo-2 kernel: drbd1: drbd1_receiver [1522]: cstate BrokenPipe --> BrokenPipe
Mar 31 15:24:57 ha-beo-2 kernel: drbd1: short read expecting header on sock: r=0
Mar 31 15:24:57 ha-beo-2 kernel: drbd1: worker terminated
Mar 31 15:24:57 ha-beo-2 kernel: drbd1: drbd1_receiver [1522]: cstate BrokenPipe --> Unconnected
Mar 31 15:24:57 ha-beo-2 kernel: drbd1: Connection lost.
Mar 31 15:24:57 ha-beo-2 kernel: drbd1: drbd1_receiver [1522]: cstate Unconnected --> WFConnection

I can then do "drbdadm connect all" on the new primary and the sync starts. I understand that the old primary wanted to sync in the wrong direction. This is why I asked if it is the correct procedure to just call "drbdadm connect <resource>" after "primary <resource>"? Or is there any other way to automate this process?

Peter
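
For illustration only, here is a minimal sketch of how the situation in the logs above could be detected automatically. It assumes the kernel messages keep the "cstate OLD --> NEW" form shown above; the function name standalone_devices and the regular expression are hypothetical helpers, not part of DRBD. It only reports which devices last dropped to StandAlone and does not run any drbdadm command itself.

```python
import re

# Matches kernel log lines of the form seen in this thread, e.g.
#   "... kernel: drbd0: drbd0_receiver [6076]: cstate WFReportParams --> StandAlone"
# (assumption: the message format stays as shown in the logs above)
CSTATE_RE = re.compile(r"kernel: (drbd\d+): .*cstate (\w+) --> (\w+)")


def standalone_devices(log_lines):
    """Return the devices whose last logged cstate transition ended in StandAlone."""
    last_state = {}
    for line in log_lines:
        match = CSTATE_RE.search(line)
        if match:
            device, _old_state, new_state = match.groups()
            # Later lines overwrite earlier ones, so only the final state counts.
            last_state[device] = new_state
    return sorted(dev for dev, state in last_state.items() if state == "StandAlone")
```

A wrapper script could feed this the recent syslog lines on the new primary and, for any device reported, issue the same "drbdadm connect all" that was run by hand above. Whether reconnecting unattended is safe in a given setup is a separate question that this sketch does not answer.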