/ 2006-05-19 12:30:01 +0200
\ Wim Ceulemans:
> Hi
>
> According to the documentation drbd should never go into StandAlone mode unless for some serious problems. From the Changelog of drbd the
> latest issues concerning going into StandAlone mode were fixed in 0.7.12 and we are using version 0.7.14.
>
> So my question, in rebooting the primary node of an ha cluster, drbd got into the StandAlone state for the reason as explained below.
> This is what is happening on the primary:
>
> 03:58:10 kernel drbd0: Primary/Secondary --> Secondary/Secondary
> 03:58:10 kernel drbd0: drbdsetup [4377]: cstate Connected --> Unconnected
> 03:58:10 kernel drbd0: drbd0_receiver [1529]: cstate Unconnected --> BrokenPipe
> 03:58:10 kernel drbd0: short read expecting header on sock: r=-512
> 03:58:10 kernel drbd0: asender terminated
> 03:58:10 kernel drbd0: worker terminated
> 03:58:10 kernel drbd0: drbd0_receiver [1529]: cstate BrokenPipe --> StandAlone
> 03:58:10 kernel drbd0: Connection lost.
> 03:58:10 kernel drbd0: receiver terminated
> 03:58:10 kernel drbd0: drbdsetup [4377]: cstate StandAlone --> StandAlone
> 03:58:10 kernel drbd0: drbdsetup [4377]: cstate StandAlone --> Unconfigured
> 03:58:10 kernel drbd0: worker terminated
> 03:58:10 kernel drbd: module cleanup done.

this is a "drbdsetup /dev/drbd0", an administrative tear down request.
so what exactly are you complaining about here?

> And this is what is happening on the secondary:
>
> 03:58:27 kernel drbd0: Secondary/Primary --> Secondary/Secondary
> 03:58:27 kernel drbd0: meta connection shut down by peer.
> 03:58:27 kernel drbd0: drbd0_asender [1481]: cstate Connected --> NetworkFailure
> 03:58:27 kernel drbd0: asender terminated
> 03:58:27 kernel drbd0: drbd0_receiver [1466]: cstate NetworkFailure --> BrokenPipe
> 03:58:27 kernel drbd0: short read expecting header on sock: r=-512
> 03:58:27 kernel drbd0: worker terminated
> 03:58:27 kernel drbd0: drbd0_receiver [1466]: cstate BrokenPipe --> Unconnected
> 03:58:27 kernel drbd0: Connection lost.
> 03:58:27 kernel drbd0: drbd0_receiver [1466]: cstate Unconnected --> WFConnection
> 03:58:28 kernel drbd0: Secondary/Unknown --> Primary/Unknown
> 03:58:37 kernel EXT3 FS 2.4-0.9.19, 19 August 2002 on drbd(147,0), internal journal
> 03:58:37 kernel drbd0: Unable to bind source sock (-99)
> 03:58:37 kernel drbd0: Unable to bind sock2 (-99)

the IP you configured drbd to use is not available.
sorry, without ip we cannot connect.

> 03:58:37 kernel drbd0: drbd0_receiver [1466]: cstate WFConnection --> Unconnected
> 03:58:37 kernel drbd0: worker terminated
> 03:58:37 kernel drbd0: drbd0_receiver [1466]: cstate Unconnected --> Unconnected
> 03:58:37 kernel drbd0: Connection lost.
> 03:58:37 kernel drbd0: Discarding network configuration.
> 03:58:37 kernel drbd0: drbd0_receiver [1466]: cstate Unconnected --> StandAlone
> 03:58:37 kernel drbd0: receiver terminated
>
> After further investigating I have found why drbd goes standalone.
> Somewhere on the server the ip addresses on the interface are temporarily flushed and re-added by a 'ip route flush dev eth0', followed
> by several 'ip addr add' commands. If at that moment drbd tries to sync a packet it considers this as a serious network error and goes
> in standalone mode.
>
> So, my question, is there a possibility to disable drbd going in standalone mode when it encounters serious network trouble?

you could try and hack some
 "is this ip supposed to be used by drbd?"
  "yes -> is drbd StandAlone on that connection?"
   "yes -> is there some admin lock file that says 'disable this automagic'?"
    "no -> drbdadm connect whatever"
in some ip-up script of your flaky nics...

better yet: find that "Somewhere on the server" that is causing the
trouble in the first place, and fix it there.

-- 
: Lars Ellenberg                            Tel +43-1-8178292-0  :
: LINBIT Information Technologies GmbH      Fax +43-1-8178292-82 :
: Schoenbrunner Str. 244, A-1120 Vienna/Europe http://www.linbit.com :
__
please use the "List-Reply" function of your email client.