[DRBD-user] Patched heartbeat 2.1.3 DRBD peer outdater

Lars Ellenberg lars.ellenberg at linbit.com
Fri Dec 5 12:39:15 CET 2008



On Thu, Dec 04, 2008 at 12:35:24PM +0100, Ing. Maros TIMKO wrote:
> Hi all,
> 
> running heartbeat-2.1.3-3.el5.centos.x86_64 and drbd82-8.2.6-1.el5.centos.x86_64 on x64 CentOS 5.2.
> I deployed patched DOPD files that I downloaded from linbit.com pages
> some weeks ago. However, when I shut down heartbeat and after that
> DRBD service on one node I can still see the following in the logs:
> Dec  4 11:11:18 vsp11 kernel: drbd0: Requested state change failed by peer: Refusing to be Primary while peer is not outdated
> Dec  4 11:11:18 vsp11 kernel: drbd0: meta connection shut down by peer.
> Dec  4 11:11:18 vsp11 kernel: drbd0: peer( Primary -> Unknown ) conn( Connected -> Disconnecting ) disk( UpToDate -> Outdated ) pdsk( UpToDate -> DUnknown ) 

fine. so Outdating worked.

> Dec  4 11:11:18 vsp11 kernel: drbd0: asender terminated
> Dec  4 11:11:18 vsp11 kernel: drbd0: Terminating asender thread
> Dec  4 11:11:18 vsp11 kernel: drbd0: sock was shut down by peer
> Dec  4 11:11:18 vsp11 kernel: drbd0: short read expecting header on sock: r=0
> Dec  4 11:11:18 vsp11 kernel: drbd0: Writing meta data super block now.
> Dec  4 11:11:18 vsp11 kernel: drbd0: tl_clear()
> Dec  4 11:11:18 vsp11 kernel: drbd0: Connection closed
> Dec  4 11:11:18 vsp11 kernel: drbd0: conn( Disconnecting -> StandAlone ) 
> Dec  4 11:11:18 vsp11 kernel: drbd0: receiver terminated
> Dec  4 11:11:18 vsp11 kernel: drbd0: Terminating receiver thread
> Dec  4 11:11:18 vsp11 kernel: drbd0: disk( Outdated -> Diskless ) 
> Dec  4 11:11:18 vsp11 kernel: drbd0: drbd_bm_resize called with capacity == 0
> Dec  4 11:11:18 vsp11 kernel: drbd0: worker terminated
> Dec  4 11:11:18 vsp11 kernel: drbd0: Terminating worker thread
> Dec  4 11:11:18 vsp11 kernel: drbd1: Requested state change failed by peer: Refusing to be Primary while peer is not outdated
> Dec  4 11:11:18 vsp11 kernel: drbd1: sock was shut down by peer
> Dec  4 11:11:18 vsp11 kernel: drbd1: peer( Primary -> Unknown ) conn( Connected -> BrokenPipe ) pdsk( UpToDate -> DUnknown ) 
> Dec  4 11:11:18 vsp11 kernel: drbd1: short read expecting header on sock: r=0
> Dec  4 11:11:18 vsp11 kernel: drbd1: disk( UpToDate -> Outdated ) 

again, outdating worked.

> Dec  4 11:11:18 vsp11 kernel: drbd1: asender terminated
> Dec  4 11:11:18 vsp11 kernel: drbd1: Terminating asender thread
> Dec  4 11:11:18 vsp11 kernel: drbd1: Writing meta data super block now.
> Dec  4 11:11:18 vsp11 kernel: drbd1: tl_clear()
> Dec  4 11:11:18 vsp11 kernel: drbd1: disk( Outdated -> Diskless ) 
> Dec  4 11:11:18 vsp11 kernel: drbd1: Connection closed
> Dec  4 11:11:18 vsp11 kernel: drbd1: conn( BrokenPipe -> Unconnected ) 
> Dec  4 11:11:18 vsp11 kernel: drbd1: receiver terminated
> Dec  4 11:11:18 vsp11 kernel: drbd1: receiver (re)started
> Dec  4 11:11:18 vsp11 kernel: drbd1: conn( Unconnected -> WFConnection ) 
> Dec  4 11:11:18 vsp11 kernel: drbd2: Requested state change failed by peer: Refusing to be Primary while peer is not outdated
> Dec  4 11:11:18 vsp11 kernel: drbd2: sock was shut down by peer
> Dec  4 11:11:18 vsp11 kernel: drbd2: peer( Primary -> Unknown ) conn( Connected -> BrokenPipe ) pdsk( UpToDate -> DUnknown ) 
> Dec  4 11:11:18 vsp11 kernel: drbd2: short read expecting header on sock: r=0
> Dec  4 11:11:18 vsp11 kernel: drbd2: disk( UpToDate -> Outdated ) 
> Dec  4 11:11:18 vsp11 kernel: drbd2: asender terminated
> Dec  4 11:11:18 vsp11 kernel: drbd2: Terminating asender thread
> Dec  4 11:11:18 vsp11 kernel: drbd2: Writing meta data super block now.
> Dec  4 11:11:18 vsp11 kernel: drbd2: tl_clear()
> Dec  4 11:11:18 vsp11 kernel: drbd2: disk( Outdated -> Diskless ) 
> Dec  4 11:11:18 vsp11 kernel: drbd2: Connection closed
> Dec  4 11:11:18 vsp11 kernel: drbd2: conn( BrokenPipe -> Unconnected ) 
> Dec  4 11:11:18 vsp11 kernel: drbd2: receiver terminated
> Dec  4 11:11:18 vsp11 kernel: drbd2: receiver (re)started
> Dec  4 11:11:18 vsp11 kernel: drbd2: conn( Unconnected -> WFConnection ) 
> Dec  4 11:11:18 vsp11 kernel: drbd3: peer( Secondary -> Unknown ) conn( Connected -> Disconnecting ) pdsk( UpToDate -> DUnknown ) 
> Dec  4 11:11:18 vsp11 kernel: drbd3: Writing meta data super block now.
> Dec  4 11:11:18 vsp11 kernel: drbd3: short read expecting header on sock: r=-512
> Dec  4 11:11:18 vsp11 kernel: drbd3: meta connection shut down by peer.
> Dec  4 11:11:18 vsp11 kernel: drbd3: asender terminated
> Dec  4 11:11:18 vsp11 kernel: drbd3: Terminating asender thread
> Dec  4 11:11:18 vsp11 kernel: drbd3: tl_clear()
> Dec  4 11:11:18 vsp11 kernel: drbd3: Connection closed
> Dec  4 11:11:18 vsp11 kernel: drbd3: conn( Disconnecting -> StandAlone ) 
> Dec  4 11:11:18 vsp11 kernel: drbd3: receiver terminated
> Dec  4 11:11:18 vsp11 kernel: drbd3: Terminating receiver thread
> Dec  4 11:11:18 vsp11 kernel: drbd3: disk( UpToDate -> Diskless ) 
> Dec  4 11:11:18 vsp11 kernel: drbd3: drbd_bm_resize called with capacity == 0
> Dec  4 11:11:18 vsp11 kernel: drbd3: worker terminated
> Dec  4 11:11:18 vsp11 kernel: drbd3: Terminating worker thread
> ...
> 
> Interesting is that we have 7 DRBD resources. All of them (after HA
> and DRBD shutdown) become Unconfigured but drbd1 and drbd2 - they are
> Connected, Secondary/Primary, Diskless/UpToDate. You can see "receiver
> (re)started" for them in the log.
> As linbit has removed the patches from their pages, I am curious
> whether this patch is the solution or not.  I searched the
> internet, but found no definite answer.

right.
I removed those yesterday, because heartbeat 2.1.3 has been superseded by
the heartbeat 2.1.4 release, which includes these exact patches.

the log above does not indicate any problem with dopd at all:
drbd0, drbd1 and drbd2 all show disk( UpToDate -> Outdated ).
I think your dopd is just fine.
whatever you are seeing probably has nothing to do with dopd.
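
for anyone who wants to check a longer log the same way, here is a minimal
sketch, not part of DRBD or heartbeat. it assumes the 8.2-style kernel
messages quoted above ("field( Old -> New )"), and the names
collect_transitions, was_outdated and final_state are made up for this
example. the sample lines are copied from the quoted log.

```python
import re

# Assumed log format: "... kernel: drbdN: field( Old -> New ) ..."
DEVICE_RE = re.compile(r"kernel: (drbd\d+): (.*)")
TRANSITION_RE = re.compile(r"(\w+)\( (\w+) -> (\w+) \)")


def collect_transitions(lines):
    """Return {device: {field: [(old, new), ...]}} in log order."""
    transitions = {}
    for line in lines:
        match = DEVICE_RE.search(line)
        if match is None:
            continue  # not a DRBD kernel message
        device, message = match.groups()
        per_device = transitions.setdefault(device, {})
        for field, old, new in TRANSITION_RE.findall(message):
            per_device.setdefault(field, []).append((old, new))
    return transitions


def was_outdated(transitions, device):
    """True if the local disk went UpToDate -> Outdated on this device."""
    history = transitions.get(device, {}).get("disk", [])
    return ("UpToDate", "Outdated") in history


def final_state(transitions, device, field):
    """Last logged value of a field (conn, disk, peer, pdsk), or None."""
    history = transitions.get(device, {}).get(field, [])
    return history[-1][1] if history else None


# A few lines taken verbatim from the log quoted in this thread.
SAMPLE = [
    "Dec  4 11:11:18 vsp11 kernel: drbd0: peer( Primary -> Unknown ) "
    "conn( Connected -> Disconnecting ) disk( UpToDate -> Outdated ) "
    "pdsk( UpToDate -> DUnknown ) ",
    "Dec  4 11:11:18 vsp11 kernel: drbd1: disk( UpToDate -> Outdated ) ",
    "Dec  4 11:11:18 vsp11 kernel: drbd1: conn( BrokenPipe -> Unconnected ) ",
    "Dec  4 11:11:18 vsp11 kernel: drbd1: conn( Unconnected -> WFConnection ) ",
    "Dec  4 11:11:18 vsp11 kernel: drbd3: peer( Secondary -> Unknown ) "
    "conn( Connected -> Disconnecting ) pdsk( UpToDate -> DUnknown ) ",
    "Dec  4 11:11:18 vsp11 kernel: drbd3: disk( UpToDate -> Diskless ) ",
]

if __name__ == "__main__":
    found = collect_transitions(SAMPLE)
    for device in sorted(found):
        print(device,
              "outdated:", was_outdated(found, device),
              "final conn:", final_state(found, device, "conn"))
```

a device whose peer was Secondary (drbd3 here) is not expected to be
outdated, so a False there is not a dopd failure. the final conn state is
where drbd1 and drbd2 (WFConnection) differ from the others (StandAlone).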

the log from the peer for the same events may be helpful.
the syslog from both nodes for the relevant events may help too,
or wherever heartbeat (and therefore dopd) logs go.

> I also found heartbeat 2.1.4 for CentOS that should include this patch on:
> http://download.opensuse.org
> 
> Does anyone have experience with 2.1.3 patched DOPD or the 2.1.4 version?

the 2.1.4 contains _exactly_ those patches.

> PS: I also can see following in the log file: Dec  4 11:11:18 vsp11
> kernel: drbd5: ASSERT( mdev->receiver.t_state == None ) in
> /home/buildsvn/rpmbuild/BUILD/drbd-8.2.6/_kmod_build_xen/drbd/drbd_main.c:2412
> Does it actually mean anything?

yes.
you should upgrade to 8.2.7,
and try again.

-- 
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
__
please don't Cc me, but send to list   --   I'm subscribed


