[DRBD-user] operation monitor failed 'not configured' - how to tell what's not configured?

Wed Sep 24 18:24:54 CEST 2014

On 09/24/2014 04:31 PM, Klint Gore wrote:
>> From: drbd-user-bounces at lists.linbit.com [drbd-user-bounces at lists.linbit.com] On Behalf Of Lars Ellenberg [lars.ellenberg at linbit.com]
>> On Wed, Sep 24, 2014 at 10:17:34AM +1000, Klint Gore wrote:
>>> [root@hans0 ~]# yum list installed |grep -E "(coro|pacemaker|drbd)"
>>> corosync.x86_64                       2.3.3-2.el7                      @base
>>> corosynclib.x86_64                    2.3.3-2.el7                      @base
>>> drbd84-utils.x86_64                   8.9.1-1.el7.elrepo               @elrepo
>>
>> In that case, it likely is "not installed", even.
>>
>> "drbd*pacemaker" would contain our ocf agent script:
>> /usr/lib/ocf/resource.d/linbit/drbd
>>
>
> Looks like it exists.  Same file exists on both nodes (md5 matches).  Is there a way to tell what version it is?  Should there be other files as well?
>
> [root@hans0 linbit]# pwd
> /usr/lib/ocf/resource.d/linbit
> [root@hans0 linbit]# ll
> total 36
> -rwxr-xr-x. 1 root root 33261 Aug 18 12:48 drbd
> [root@hans0 linbit]# head drbd
> #!/bin/bash
> #
> #
> #               OCF Resource Agent compliant drbd resource script.
> #
> # Copyright (c) 2009 LINBIT HA-Solutions GmbH,
> # Copyright (c) 2009 Florian Haas, Lars Ellenberg
> # Based on the Heartbeat drbd OCF Resource Agent by Lars Marowsky-Bree
> # (though it turned out to be an almost complete rewrite)
> #
> [root@hans0 linbit]# md5sum drbd
> 0b95f50c91bd12744ec204d4f7849b12  drbd

It may sound obvious, but did you try to start your cluster resources
manually without pacemaker running? (up resource, set primary, mount, ...)
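
A minimal sketch of that manual sequence, assuming a resource named r0 on
/dev/drbd0 mounted at /mnt/drbd (all three are placeholders, not values
from this thread). The run and manual_start helpers are hypothetical and
only print the commands unless DRY_RUN=0 is set:

```shell
#!/bin/bash
# Placeholder values: replace with the names from your own DRBD config.
RES="${RES:-r0}"            # assumed DRBD resource name
DEV="${DEV:-/dev/drbd0}"    # assumed DRBD device
MNT="${MNT:-/mnt/drbd}"     # assumed mount point

# Print each command; execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

# Bring the resource up, promote it, and mount it, stopping at the
# first step that fails.
manual_start() {
    run drbdadm up "$RES" &&
    run drbdadm primary "$RES" &&
    run mount "$DEV" "$MNT"
}
```

Calling manual_start shows what would be run; DRY_RUN=0 manual_start
executes the steps, with pacemaker stopped and on one node only.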

If it works manually, do you have any errors in pacemaker's log files?
("journalctl | grep -i error")

If there's nothing obvious in the log files, you can add the following
line to the OCF agent script on the DC:

set > /tmp/blah

That way you'll be able to find out which OCF_* variables are used. You
could then run the script manually with those variables exported and
see what's happening (don't do that on a production cluster).
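
One way to replay such a dump, as a sketch only: it assumes /tmp/blah was
written by bash's "set" (one NAME=value per line, values already
shell-quoted) and that no OCF_* value spans several lines. The helper
name ocf_exports is made up for this example:

```shell
#!/bin/bash
# Turn the OCF_* assignments found in a `set` dump into export statements.
ocf_exports() {
    grep '^OCF_[A-Za-z0-9_]*=' "$1" | sed 's/^/export /'
}

# Example use (not run here; test cluster only):
#   eval "$(ocf_exports /tmp/blah)"
#   /usr/lib/ocf/resource.d/linbit/drbd monitor; echo "rc=$?"
# An exit code of 6 is OCF_ERR_CONFIGURED, which pacemaker reports
# as 'not configured'.
```

Running the agent under "bash -x" with the same exports should show which
check produces that return code.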

A side note: ELRepo packages drbd-utils as a single rpm, while Fedora
(or you, should you build the rpm yourself) splits the functionality into
several rpms, e.g.

drbd-udev
drbd-utils
drbd-pacemaker
...

Here drbd-pacemaker provides the OCF resource agent, as well as the
stonith/fence scripts.
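
To see which installed rpm (and so which version) owns the agent file,
"rpm -qf" can be asked directly. A small sketch; the wrapper name
agent_owner and its return codes are assumptions for illustration:

```shell
#!/bin/bash
# Report which installed rpm owns a file.
# Returns 2 if the file does not exist, 3 if rpm is unavailable.
agent_owner() {
    if [ ! -e "$1" ]; then
        echo "no such file: $1" >&2
        return 2
    fi
    if ! command -v rpm >/dev/null 2>&1; then
        echo "rpm not available" >&2
        return 3
    fi
    rpm -qf "$1"
}

# Example use:
#   agent_owner /usr/lib/ocf/resource.d/linbit/drbd
```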

I have migrated a CentOS 6/cman/pacemaker cluster to
CentOS 7/corosync2/pacemaker and I don't have any problems except a weird
path issue when running more "low-level" drbd commands (e.g. when
manually recovering from a split brain). I'll probably file a bug when
I have time to test whether the problem is still in the latest rc.

ivan