[DRBD-user] DRBD 8.3.0 building general question with make PREFIX=...

Lars Ellenberg lars.ellenberg at linbit.com
Tue Feb 24 17:28:33 CET 2009


On Tue, Feb 24, 2009 at 04:51:44PM +0100, GAUTIER Hervé wrote:
> 
> Hi,
> 
> Well, I will try to be as clear as possible.
> Sorry if I am not.
> It is a bit long.
> 
> On a system (RH 4.2U6), I had a DRBD 8.2.5 built from source, installed
> as follows:
> - Sources were in /usr/local/drbd-8.2.5/source/drbd-8.2.5
> - Symbolic link /usr/local/drbd -> /usr/local/drbd-8.2.5
> - Configuration file was /usr/local/drbd-8.2.5/etc/drbd.conf
> - DRBD module was /lib/modules/`uname -r`/kernel/drivers/block/drbd.ko
> - Userland tools were in /sbin
> 
> 
> I was happy with this configuration, but a problem appeared in one of
> my tests. I can reproduce this problem in our labs, not every time but
> often enough.
> So I said to myself, let's install the latest version (8.3.0) in order to
> test whether the problem is fixed.
> At the same time, I wanted to do a proper installation, to be sure which
> DRBD I am using, that is:
> - Sources are in /usr/local/drbd-8.3.0/source/drbd-8.3.0
> - Symbolic link /usr/local/drbd -> /usr/local/drbd-8.3.0
> - Build with "make PREFIX=/usr/local/drbd all
>    && make PREFIX=/usr/local/drbd install".
> - Configuration file is /usr/local/drbd-8.3.0/etc/drbd.conf
>    (same as for the 8.2.5 version)
> - DRBD module is now in /usr/local/drbd-8.3.0/\
>    lib/modules/`uname -r`/kernel/drivers/block/drbd.ko
> - Userland tools are now in /usr/local/drbd-8.3.0/sbin
> 
> My first problem was that the drbdadm tool looked for the new
> /var/lib/drbd//drbd-minor-0.conf file, but this file was in the
> /usr/local/drbd-8.3.0/var/lib/drbd directory.
> So I have modified the sources to take the make PREFIX=... setting
> into account for this directory. I have posted a patch on the dev mailing list.
> 
> In order to be able to roll back to the 8.2.5 version, I have left the
> userland tools in /sbin and modules in
> /lib/modules/`uname -r`/kernel/drivers/block/drbd.ko
> I have modified my PATH so that /usr/local/drbd-8.3.0/sbin
> is searched first.

we never did that.
it probably won't work.
it may have worked in the old days where we did not have any
kernel->userland callbacks.
but it possibly can be made to work, anyways.

> I am positive that the loaded kernel module is the right one, and that
> the userland tools in use are the right ones.
> 
> All was OK, but after some tests, a new problem appeared on a "drbd
> connect resource" command:
> -8<---------------------------------------------------------------------------
> Feb 23 10:55:42 rh4-2_1 kernel: drbd0: conn( StandAlone -> Unconnected )
> Feb 23 10:55:42 rh4-2_1 kernel: drbd0: Starting receiver thread (from drbd1_worker [31777])
> Feb 23 10:55:42 rh4-2_1 kernel: drbd0: receiver (re)started
> Feb 23 10:55:42 rh4-2_1 kernel: drbd0: conn( Unconnected -> WFConnection )
> Feb 23 10:55:42 rh4-2_1 kernel: drbd0: Handshake successful: Agreed network protocol version 89
> Feb 23 10:55:42 rh4-2_1 kernel: drbd0: conn( WFConnection -> WFReportParams )
> Feb 23 10:55:42 rh4-2_1 kernel: drbd0: Starting asender thread (from drbd1_receiver [713])
> Feb 23 10:55:42 rh4-2_1 kernel: drbd0: data-integrity-alg: <not-used>
> Feb 23 10:55:42 rh4-2_1 kernel: drbd0: drbd_sync_handshake:
> Feb 23 10:55:42 rh4-2_1 kernel: drbd0: self C0B886A078EDF8CE:0000000000000000:1BA32C69B29346D5:3191C4C20BDF4701
> Feb 23 10:55:42 rh4-2_1 kernel: drbd0: peer EEE9BF7DEF3EEA38:C0B886A078EDF8CE:9B8B7DCBC6A7F5D2:6889940B860C7E9A
> Feb 23 10:55:42 rh4-2_1 kernel: drbd0: uuid_compare()=-1 by rule 5
> Feb 23 10:55:42 rh4-2_1 kernel: drbd0: peer( Unknown -> Secondary ) conn( WFReportParams -> WFBitMapT ) pdsk( DUnknown -> UpToDate )
> Feb 23 10:55:42 rh4-2_1 kernel: drbd0: conn( WFBitMapT -> WFSyncUUID )
> Feb 23 10:55:42 rh4-2_1 kernel: drbd0: helper command: /sbin/drbdadm before-resync-target minor-0
> Feb 23 10:55:42 rh4-2_1 kernel: drbd0: helper command: */sbin/drbdadm before-resync-target minor-0 exit code 3 (0x300)*
> Feb 23 10:55:42 rh4-2_1 kernel: drbd0: before-resync-target handler returned 3, dropping connection.
> Feb 23 10:55:42 rh4-2_1 kernel: drbd0: peer( Secondary -> Unknown ) conn( WFSyncUUID -> Disconnecting ) pdsk( UpToDate -> DUnknown )
> Feb 23 10:55:42 rh4-2_1 kernel: drbd0: asender terminated
> Feb 23 10:55:42 rh4-2_1 kernel: drbd0: Terminating asender thread
> Feb 23 10:55:42 rh4-2_1 kernel: drbd0: Connection closed
> Feb 23 10:55:42 rh4-2_1 kernel: drbd0: conn( Disconnecting -> StandAlone )
> Feb 23 10:55:42 rh4-2_1 kernel: drbd0: receiver terminated
> Feb 23 10:55:42 rh4-2_1 kernel: drbd0: Terminating receiver thread
> -8<---------------------------------------------------------------------------
> 
> Seeing these messages, I thought that the DRBD module was calling the
> wrong drbdadm userland command.

the drbd module calls the usermode_helper program,
which is a module parameter, defaulting to hardcoded /sbin/drbdadm.
you can change that at runtime by
 echo /usr/local/whatever > /sys/module/drbd/parameters/usermode_helper
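
A minimal sketch of that runtime change, assuming the module is loaded and exposes the parameter at the sysfs path given above. The function name set_drbd_helper and the DRBD_HELPER_PARAM override are made up for illustration (the override only exists so the function can be tried against a scratch file).

```shell
#!/bin/sh
# Hypothetical helper: point the drbd module at a different usermode helper.
# DRBD_HELPER_PARAM is an illustrative override, not a DRBD setting; by
# default the sysfs path named in the text above is used.
set_drbd_helper() {
    new_helper=$1
    param=${DRBD_HELPER_PARAM:-/sys/module/drbd/parameters/usermode_helper}

    # Refuse a helper path that could not be executed anyway.
    if [ ! -x "$new_helper" ]; then
        echo "not executable: $new_helper" >&2
        return 1
    fi
    # The parameter file is missing if the module is not loaded,
    # and not writable unless we are root.
    if [ ! -w "$param" ]; then
        echo "cannot write $param (module not loaded, or not root?)" >&2
        return 1
    fi

    echo "helper was: $(cat "$param")"
    echo "$new_helper" > "$param"
    echo "helper now: $(cat "$param")"
}
```

For the layout described in the original post this would be called, as root, as
 set_drbd_helper /usr/local/drbd-8.3.0/sbin/drbdadm
A value written through sysfs does not survive unloading the module, so it has to be set again after every reload.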

> So I have renamed the previous userland tools and kernel module:
> mv -i /sbin/drbdadm /sbin/drbdadm-8.2.5
> mv -i /sbin/drbdsetup /sbin/drbdsetup-8.2.5
> mv -i /sbin/drbdmeta /sbin/drbdmeta-8.2.5
> mv -i /lib/modules/`uname -r`/kernel/drivers/block/drbd.ko \
>    /lib/modules/`uname -r`/kernel/drivers/block/drbd-8.2.5.ko
> 
> So now, there is no way to call the old installation.
> I have issued a new "drbd connect resource" command:
> -8<---------------------------------------------------------------------------
> Feb 23 10:59:41 rh4-2_1 kernel: drbd0: conn( StandAlone -> Unconnected )
> Feb 23 10:59:41 rh4-2_1 kernel: drbd0: Starting receiver thread (from drbd1_worker [31777])
> Feb 23 10:59:41 rh4-2_1 kernel: drbd0: receiver (re)started
> Feb 23 10:59:41 rh4-2_1 kernel: drbd0: conn( Unconnected -> WFConnection )
> Feb 23 10:59:42 rh4-2_1 kernel: drbd0: Handshake successful: Agreed network protocol version 89
> Feb 23 10:59:42 rh4-2_1 kernel: drbd0: conn( WFConnection -> WFReportParams )
> Feb 23 10:59:42 rh4-2_1 kernel: drbd0: Starting asender thread (from drbd1_receiver [4357])
> Feb 23 10:59:42 rh4-2_1 kernel: drbd0: data-integrity-alg: <not-used>
> Feb 23 10:59:42 rh4-2_1 kernel: drbd0: drbd_sync_handshake:
> Feb 23 10:59:42 rh4-2_1 kernel: drbd0: self 5B1E45C971132AAA:0000000000000000:1BA32C69B29346D5:3191C4C20BDF4701
> Feb 23 10:59:42 rh4-2_1 kernel: drbd0: peer EEE9BF7DEF3EEA38:5B1E45C971132AAB:C0B886A078EDF8CE:9B8B7DCBC6A7F5D2
> Feb 23 10:59:42 rh4-2_1 kernel: drbd0: uuid_compare()=-1 by rule 5
> Feb 23 10:59:42 rh4-2_1 kernel: drbd0: peer( Unknown -> Secondary ) conn( WFReportParams -> WFBitMapT ) pdsk( DUnknown -> UpToDate )
> Feb 23 10:59:42 rh4-2_1 kernel: drbd0: conn( WFBitMapT -> WFSyncUUID )
> Feb 23 10:59:42 rh4-2_1 kernel: drbd0: helper command: /sbin/drbdadm before-resync-target minor-0
> Feb 23 10:59:42 rh4-2_1 kernel: drbd0: helper command: */sbin/drbdadm before-resync-target minor-0 exit code 0 (0x0)*
> Feb 23 10:59:42 rh4-2_1 kernel: drbd0: conn( WFSyncUUID -> SyncTarget ) disk( UpToDate -> Inconsistent )
> Feb 23 10:59:42 rh4-2_1 kernel: drbd0: Began resync as SyncTarget (will sync 7177172 KB [1794293 bits set]).
> -8<---------------------------------------------------------------------------
> 
> It worked and the synchronization was OK, but the message again shows
> /sbin/drbdadm, which should not be possible!
> I think that the right /usr/local/drbd-8.3.0/sbin/drbdadm was used;
> otherwise, I don't know how it could find another drbdadm binary.

apparently, the call_usermode_helper in your kernel version
silently ignores exec failures.

this may correspond to upstream (kernel.org)

  commit 111dbe0c8a21dffa473239861be47ebc87f593b3
  Author: Björn Steinbrink <B.Steinbrink at gmx.de>
  Date:   Fri Sep 29 02:00:46 2006 -0700

    [PATCH] Fix ____call_usermodehelper errors being silently ignored

    If ____call_usermodehelper fails, we're not interested in the child
    process' exit value, but the real error, so let's stop
    wait_for_helper from overwriting it in that case.

    Issue discovered by Benedikt Böhm while working on a Linux-VServer
    usermode helper.

    Signed-off-by: Björn Steinbrink <B.Steinbrink at gmx.de>
    Cc: Rusty Russell <rusty at rustcorp.com.au>
    Signed-off-by: Andrew Morton <akpm at osdl.org>
    Signed-off-by: Linus Torvalds <torvalds at osdl.org>


which happened sometime after 2.6.18,
and may or may not have been backported to vendor kernels.
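
For reading the "helper command" lines in the quoted logs: the two numbers look like the same value in two forms, the hex one being the raw wait status and the other its exit-code byte. A small sketch, assuming the conventional wait-status encoding (exit code in bits 8..15); the function name is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: extract the exit-code byte from a raw wait status,
# assuming the conventional encoding (exit code in bits 8..15).
exit_code_from_status() {
    status=$(( $1 ))                    # accepts decimal or 0x-prefixed hex
    echo $(( (status >> 8) & 0xff ))
}

exit_code_from_status 0x300   # value from the first log excerpt, prints 3
exit_code_from_status 0x0     # value from the second log excerpt, prints 0
```

On a kernel that ignores exec failures, the 0 in the second excerpt therefore says nothing about whether /sbin/drbdadm was actually run.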

> -8<---------------------------------------------------------------------------
> Feb 23 11:03:43 rh4-2_1 kernel: drbd0: Resync done (total 241 sec; paused 0 sec; 29780 K/sec)
> Feb 23 11:03:43 rh4-2_1 kernel: drbd0: conn( SyncTarget -> Connected ) disk( Inconsistent -> UpToDate )
> Feb 23 11:03:43 rh4-2_1 kernel: drbd0: helper command: /sbin/drbdadm after-resync-target minor-0
> Feb 23 11:03:43 rh4-2_1 kernel: drbd0: helper command: /sbin/drbdadm after-resync-target minor-0 exit code 0 (0x0)
> -8<---------------------------------------------------------------------------
> 
> Since then, I get this strange message from drbdadm
> {role|cstate|dstate}:
> 
> -8<---------------------------------------------------------------------------
> # drbdadm -c ../etc/drbd.conf cstate drbd_res0;
> Secondary/Secondary
> # (43)    sync_progress = (integer) 83  [len: 4]
> -8<---------------------------------------------------------------------------

ask your bash which drbdadm it uses.
maybe you need to "hash -r" ?

as I wrote earlier, this is an expected symptom of using the older
drbdadm/drbdsetup against the newer module.

which drbdadm
type drbdadm

maybe do an
strace -e execve -f drbdadm state drbd_res0
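
A sketch for the PATH check suggested above: it lists every executable of a given name in PATH lookup order, so a leftover older copy shows up next to the new one. The function name list_on_path is made up for illustration; in bash, "type -a drbdadm" gives similar information.

```shell
#!/bin/sh
# Hypothetical helper: print every executable called "$1" found on PATH,
# in the order the shell would search the directories.
list_on_path() {
    name=$1
    old_ifs=$IFS
    IFS=:
    for dir in $PATH; do
        # An empty PATH component means the current directory.
        if [ -z "$dir" ]; then
            dir=.
        fi
        if [ -f "$dir/$name" ] && [ -x "$dir/$name" ]; then
            echo "$dir/$name"
        fi
    done
    IFS=$old_ifs
}

hash -r                 # forget cached command locations first
list_on_path drbdadm    # the first line is what a fresh lookup picks
```

If the first line is not the binary under /usr/local/drbd-8.3.0/sbin, then the interactive shell is not running the tools that were built for the loaded module.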

> Is there anything I can do that could help me understand why I
> get this message,

don't install into some PREFIX.
we never did that, therefore it is likely to break in funny ways.

> because I have looked at the sources a bit, but they are not so easy to
> understand (NL_PACKET(...)...).
> My first feeling is that it is the kernel module which prints the
> message. Is that possible?

no.

> Hope that was clear.

perfectly.

 ;)

-- 
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
__
please don't Cc me, but send to list   --   I'm subscribed