Hi,

Can you help me with this? I can't figure out why it goes "StandAlone".

Regards,
Pedro Sousa

On Thu, May 28, 2009 at 6:49 PM, Pedro Sousa <pgsousa at gmail.com> wrote:

> Can you check it please?
>
> May 27 19:38:35 ha2 heartbeat: [2426]: ERROR: glib: Unable to send bcast [-1] packet(len=217): No such device
> May 27 19:38:35 ha2 heartbeat: [2426]: ERROR: write_child: write failure on bcast eth1.: No such device
> May 27 19:38:37 ha2 heartbeat: [2426]: ERROR: glib: Unable to send bcast [-1] packet(len=217): No such device
> May 27 19:38:37 ha2 heartbeat: [2426]: ERROR: write_child: write failure on bcast eth1.: No such device
> May 27 19:38:38 ha2 kernel: drbd0: PingAck did not arrive in time.
> May 27 19:38:38 ha2 kernel: drbd0: peer( Primary -> Unknown ) conn( Connected -> NetworkFailure ) pdsk( UpToDate -> DUnknown )
> May 27 19:38:38 ha2 kernel: drbd0: asender terminated
> May 27 19:38:38 ha2 kernel: drbd0: Terminating asender thread
> May 27 19:38:38 ha2 kernel: drbd0: short read expecting header on sock: r=-512
> May 27 19:38:38 ha2 kernel: drbd0: Writing meta data super block now.
> May 27 19:38:38 ha2 kernel: drbd0: tl_clear()
> May 27 19:38:38 ha2 kernel: drbd0: Connection closed
> May 27 19:38:38 ha2 kernel: drbd0: conn( NetworkFailure -> Unconnected )
> May 27 19:38:38 ha2 kernel: drbd0: receiver terminated
> May 27 19:38:38 ha2 kernel: drbd0: receiver (re)started
> May 27 19:38:38 ha2 kernel: drbd0: conn( Unconnected -> WFConnection )
> May 27 19:38:38 ha2 kernel: drbd0: Unable to bind source sock (-99)
> May 27 19:38:38 ha2 last message repeated 2 times
> May 27 19:38:38 ha2 kernel: drbd0: Unable to bind sock2 (-99)
> May 27 19:38:38 ha2 kernel: drbd0: conn( WFConnection -> Disconnecting )
> May 27 19:38:38 ha2 kernel: drbd0: Discarding network configuration.
> May 27 19:38:38 ha2 kernel: drbd0: tl_clear()
> May 27 19:38:38 ha2 kernel: drbd0: Connection closed
> May 27 19:38:38 ha2 kernel: drbd0: conn( Disconnecting -> StandAlone )
> May 27 19:38:38 ha2 kernel: drbd0: receiver terminated
> May 27 19:38:38 ha2 kernel: drbd0: Terminating receiver thread
> May 27 19:38:39 ha2 heartbeat: [2426]: ERROR: glib: Unable to send bcast [-1] packet(len=217): No such device
> May 27 19:38:39 ha2 heartbeat: [2426]: ERROR: write_child: write failure on bcast eth1.: No such device
> May 27 19:38:40 ha2 kernel: drbd0: disk( UpToDate -> Outdated )
> May 27 19:38:40 ha2 kernel: drbd0: Writing meta data super block now.
> May 27 19:38:40 ha2 /usr/lib/heartbeat/dopd: [2513]: info: sending return code: 4, ha2.teste.local -> ha1.teste.local
> May 27 19:38:41 ha2 heartbeat: [2426]: ERROR: glib: Unable to send bcast [-1] packet(len=310): No such device
> May 27 19:38:41 ha2 heartbeat: [2426]: ERROR: write_child: write failure on bcast eth1.: No such device
> May 27 19:38:41 ha2 heartbeat: [2426]: ERROR: glib: Unable to send bcast [-1] packet(len=217): No such device
> May 27 19:38:41 ha2 heartbeat: [2426]: ERROR: write_child: write failure on bcast eth1.: No such device
> May 27 19:38:43 ha2 heartbeat: [2426]: ERROR: glib: Unable to send bcast [-1] packet(len=217): No such device
> May 27 19:38:43 ha2 heartbeat: [2426]: ERROR: write_child: write failure on bcast eth1.: No such device
> May 27 19:38:45 ha2 heartbeat: [2408]: info: Link ha1.teste.local:eth1 dead.
> May 27 19:38:45 ha2 ipfail: [2514]: info: Link Status update: Link ha1.teste.local/eth1 now has status dead
> May 27 19:38:45 ha2 heartbeat: [2426]: ERROR: glib: Unable to send bcast [-1] packet(len=217): No such device
> May 27 19:38:45 ha2 heartbeat: [2426]: ERROR: write_child: write failure on bcast eth1.: No such device
> May 27 19:38:46 ha2 ipfail: [2514]: info: Asking other side for ping node count.
> May 27 19:38:46 ha2 ipfail: [2514]: info: Checking remote count of ping nodes.
> May 27 19:38:46 ha2 heartbeat: [2426]: ERROR: glib: Unable to send bcast [-1] packet(len=223): No such device
> May 27 19:38:46 ha2 heartbeat: [2426]: ERROR: write_child: write failure on bcast eth1.: No such device
> May 27 19:38:46 ha2 heartbeat: [2426]: WARN: Temporarily Suppressing write error messages
> May 27 19:38:46 ha2 heartbeat: [2426]: WARN: Is a cable unplugged on bcast eth1?
> May 27 19:38:47 ha2 ipfail: [2514]: info: Ping node count is balanced.
> May 27 19:38:48 ha2 ipfail: [2514]: info: No giveup timer to abort.
> May 27 19:39:06 ha2 kernel: eth1: link up
>
> Regards,
> Pedro Sousa
>
> On Thu, May 28, 2009 at 4:51 PM, Lars Ellenberg <lars.ellenberg at linbit.com> wrote:
>
>> On Thu, May 28, 2009 at 01:46:43PM +0100, Pedro Sousa wrote:
>> > Hi,
>> >
>> > I'm testing split-brain in a master/slave scenario with dopd and have some
>> > doubts about the automatic recovery procedure. The steps I took were:
>> >
>> > 1º Unplug the crossover cable
>> >
>> > Master:
>> >
>> > Primary/Unknown ds:UpToDate/Outdated
>> >
>> > Slave:
>> >
>> > StandAlone ro:Secondary/Unknown ds:Consistent/DUnknown
>> >
>> > 2º Plug the cable back on:
>> >
>> > Both nodes remain with the same state: Update/Outdated and
>> > Consistent/Unknown
>> >
>> > My question is: shouldn't the slave rejoin/resync to the master
>> > automatically after I plug the cable?
>> >
>> > I have to manually run: "drbdadm adjust all" to recover it.
>>
>> once a node reaches "StandAlone",
>> you have to tell it to try and reconnect, yes.
>>
>> so this is how it is supposed to be.
>>
>> why it goes to "StandAlone" should be in the logs.
>>
>> > My conf (centos 5.3; drbd 8.3.1; heartbeat 2.99)
>> >
>> > /etc/drbd.conf
>>
>> </snip>
>>
>> --
>> : Lars Ellenberg
>> : LINBIT | Your Way to High Availability
>> : DRBD/HA support and consulting http://www.linbit.com
>>
>> DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
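
To see the sequence that ends in "StandAlone", the conn( ... ) transitions can be pulled out of the quoted syslog lines. Below is a minimal sketch in Python, assuming the kernel log format shown in the excerpt above. The names conn_transitions and path_to are hypothetical helpers written for this illustration, not part of DRBD or heartbeat, and SAMPLE is a shortened copy of the quoted log.

```python
import re

# Matches DRBD kernel log fragments such as "conn( Unconnected -> WFConnection )".
CONN_RE = re.compile(r"conn\(\s*(\w+)\s*->\s*(\w+)\s*\)")


def conn_transitions(log_text):
    """Return (old_state, new_state) pairs in the order they appear in the log."""
    return CONN_RE.findall(log_text)


def path_to(state, transitions):
    """Return the transitions up to and including the first one entering `state`."""
    for i, (_, new) in enumerate(transitions):
        if new == state:
            return transitions[: i + 1]
    return []


# Shortened copy of the log lines quoted in this thread.
SAMPLE = """\
May 27 19:38:38 ha2 kernel: drbd0: peer( Primary -> Unknown ) conn( Connected -> NetworkFailure ) pdsk( UpToDate -> DUnknown )
May 27 19:38:38 ha2 kernel: drbd0: conn( NetworkFailure -> Unconnected )
May 27 19:38:38 ha2 kernel: drbd0: conn( Unconnected -> WFConnection )
May 27 19:38:38 ha2 kernel: drbd0: Unable to bind source sock (-99)
May 27 19:38:38 ha2 kernel: drbd0: conn( WFConnection -> Disconnecting )
May 27 19:38:38 ha2 kernel: drbd0: conn( Disconnecting -> StandAlone )
"""

if __name__ == "__main__":
    for old, new in path_to("StandAlone", conn_transitions(SAMPLE)):
        print(f"{old} -> {new}")
```

In the quoted log, the step from WFConnection to Disconnecting comes right after "Unable to bind source sock (-99)", so that line may be the place to start looking.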