[DRBD-user] [Pacemaker] drbd on heartbeat links

Tue Nov 2 22:25:19 CET 2010

On 2 November 2010 22:07, Pavlos Parissis <pavlos.parissis at gmail.com> wrote:
> On 2 November 2010 16:15, Dan Frincu <dfrincu at streamwide.ro> wrote:
>> Hi,
>>
>> Pavlos Parissis wrote:
>>>
>>> Hi,
>>>
>>> I am trying to figure out how I can resolve the following scenario.
>>>
>>> Facts
>>> 3 nodes
>>> 2 DRBD ms resources
>>> 2 group resources
>>> by default drbd1/group1 runs on node-01 and drbd2/group2 runs on node-02
>>> drbd1/group1 can only run on node-01 and node-03
>>> drbd2/group2 can only run on node-02 and node-03
>>> DRBD fencing_policy is resource-only [1]
>>> 2 heartbeat links, one of which is also used for DRBD communication
>>>
>>> Scenario
>>> 1) node-01 loses both heartbeat links
>>> 2) DRBD first detects the loss of the DRBD communication link
>>> and does resource fencing by adding a location constraint which
>>> prevents drbd1 from running on node-03
>>> 3) pacemaker fencing kicks in and kills node-01
>>>
>>> Due to the location constraint created at step 2, drbd1/group1
>>> cannot run in the cluster.
>>>
>>>
>>
>> I don't understand exactly what you mean by this. Resource-only fencing
>> would create a -inf score on node-01 when the node loses the DRBD
>> communication channel (the only one DRBD uses),
> Because node-01 is the primary at the moment of the failure,
> resource fencing will create a -inf score for node-03.
>
>> however you could still have
>> heartbeat communication available via the secondary link, then you shouldn't
> As I wrote, none of the heartbeat links is available.
> After I sent the mail, I realized that node-03 will not see the
> location constraint created by node-01, because there is no heartbeat
> communication!
> Thus I think my scenario has a flaw, since none of the heartbeat links
> is available on node-01.
> Resource fencing from DRBD will be triggered but will have no effect,
> node-03 or node-02 will fence node-01, and node-03 will become
> the primary for drbd1.
>
>> fence the entire node; resource-only fencing does that for you. The only
>> thing you need to do is add the DRBD fence handlers in /etc/drbd.conf:
>>       handlers {
>>               fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
>>               after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
>>       }
>>
>> Is this what you meant?
>
> No.
> Dan, thanks for your mail.
>
>
> Since there is a flaw in that scenario, let's define a similar one.
>
> status
> node-01 primary for drbd1 and group1 runs on it
> node-02 primary for drbd2 and group2 runs on it
> node-03 secondary for drbd1 and drbd2
>
> 2 heartbeat links, one of which is also used for DRBD communication
>
> here is the scenario
> 1) on node-01, the heartbeat link that also carries DRBD communication is lost
> 2) node-01 does resource fencing and places a -inf score for drbd1 on node-03
> 3) on node-01, the second heartbeat link is lost
> 4) node-01 will be fenced by one of the other cluster members
> 5) drbd1 can't run on node-03 due to the location constraint created at step 2
>
> The problem here is that the location constraint will remain active even
> after node-01 is fenced.
>
> Any ideas?
>

I found this related thread,
http://www.gossamer-threads.com/lists/drbd/users/15380#15380

Wouldn't it be better if Pacemaker/DRBD took care of this themselves?
Manual intervention delays recovery.
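
For reference, as far as I understand crm-fence-peer.sh, the constraint
left behind at step 2 looks roughly like the sketch below in crm shell
syntax. The resource name ms_drbd1 is only an assumption for
illustration, and the exact constraint id depends on the DRBD version,
so check the output of "crm configure show" on your own cluster.

```
# Sketch only: ms_drbd1 is a hypothetical resource name, and the id
# format is an assumption that may differ between DRBD versions.
# The rule forbids the Master role on every node except node-01.
location drbd-fence-by-handler-ms_drbd1 ms_drbd1 \
        rule $role="Master" -inf: #uname ne node-01
```

Note that the rule does not name node-03 explicitly; it bans promotion
everywhere except on node-01, which has the same effect in my setup.
Clearing it by hand would mean deleting that constraint with
"crm configure delete <id>", which is the manual step I would like to
avoid.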

Cheers,
Pavlos