[DRBD-user] Impossible to get primary node.
Rob Kramer
rob at solution-space.com
Wed Oct 9 09:19:44 CEST 2019
Thanks a lot for your suggestions (Robert and Lars), it took a while
before I was able to try them on virtual machines. I hope you don't mind
that I reply to both of you in one mail -- I messed up my mail delivery
options (now corrected).
I've added my latest drbd config below, for reference.
>> I can't find any sequence of commands that can convince drbd (or pacemaker) that I *want* to use outdated data.
> This should work:
> drbdadm del-peer tapas:fims1
> drbdadm primary --force tapas
This seems to work briefly (the resource gets to UpToDate), until the next
time the Pacemaker DRBD monitor runs, which 'demotes' the resource back to
its original state.
Failed Resource Actions:
* drbd_monitor_20000 on vmnbiaas2 'master' (8): call=84,
status=complete, exitreason='',
last-rc-change='Wed Oct 9 14:49:06 2019', queued=0ms, exec=0ms
The corosync logs are difficult to follow, so I'm not sure how to get
pacemaker to accept the trickery done behind its back.
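For the archives, here is the sequence I would expect to work (a sketch, untested in anger; it assumes the pcs CLI is available, and that the Pacemaker resource id is 'drbd', guessed from the drbd_monitor_20000 action above):

```shell
# Put Pacemaker in maintenance mode so it stops managing (and demoting)
# resources while DRBD is fixed up by hand.
pcs property set maintenance-mode=true

# Force the sole survivor to primary, as suggested.
drbdadm del-peer tapas:fims1
drbdadm primary --force tapas

# Clear the stale failed-monitor state, then hand control back to Pacemaker.
pcs resource cleanup drbd
pcs property set maintenance-mode=false
```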
Lars wrote:
Alternatively, you could *add* a suitable fencing constraint to your sole survivor node, which should make the fencing succeed.
You could tell the crm-fence-peer.9.sh fencing handler that an --unreachable-peer-is-outdated.
(Manually. From a root shell. That switch is not effective from within the drbd configuration; for reasons).
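For reference, a hand-placed version of such a constraint might look like this (a sketch; 'ms_drbd' is a placeholder for the actual master/slave resource id, and pcs rule syntax is assumed):

```shell
# Ban the Master role on every node except the surviving one (vmnbiaas2),
# mimicking the drbd-fence-by-handler constraint the fence script would place.
pcs constraint location ms_drbd rule role=master score=-INFINITY '#uname' ne vmnbiaas2

# Remember to remove it again once the peer is back and resynced, e.g.:
#   pcs constraint remove <constraint-id>
```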
I tried the --unreachable-peer-is-outdated switch, after finding what the
command should look like in /var/log/messages:
DRBD_BACKING_DEV_0=/dev/mapper/centos-drbd DRBD_CONF=/etc/drbd.conf
DRBD_LL_DISK=/dev/mapper/centos-drbd DRBD_MINOR=0 DRBD_MINOR_0=0
DRBD_MY_ADDRESS=172.17.5.62 DRBD_MY_AF=ipv4 DRBD_MY_NODE_ID=1
DRBD_NODE_ID_0=vmnbiaas1 DRBD_NODE_ID_1=vmnbiaas2
DRBD_PEER_ADDRESS=172.17.5.61 DRBD_PEER_AF=ipv4 DRBD_PEER_NODE_ID=0
DRBD_RESOURCE=tapas DRBD_VOLUME=0 UP_TO_DATE_NODES=0x00000002
/usr/lib/drbd/crm-fence-peer.9.sh --unreachable-peer-is-outdated
This failed as follows:
Oct 9 14:29:42 vmnbiaas2 crm-fence-peer.9.sh[6153]: WARNING Found <cib
crm_feature_set="3.0.14" validate-with="pacemaker-2.10" epoch="48"
num_updates="23" admin_epoch="0" cib-last-written="Wed Oct 9 14:07:22
2019" update-origin="vmnbiaas1" update-client="cibadmin"
update-user="root" have-quorum="0" dc-uuid="1"
Oct 9 14:29:42 vmnbiaas2 crm-fence-peer.9.sh[6153]: WARNING I don't
have quorum; did not place the constraint!
OK, since I'm experimenting anyway, I quick-hacked the script to use
fail_if_no_quorum=false, after which the error changes to:
Oct 9 14:38:13 vmnbiaas2 crm-fence-peer.9.sh[7579]: WARNING some peer
is UNCLEAN, my disk is not UpToDate, did not place the constraint!
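For what it's worth: on a two-node cluster the surviving node can never hold more than half the votes, so the handler's quorum check is bound to fail whenever the peer is down. A cleaner route than hacking the script might be (a sketch, assuming corosync votequorum and the pcs CLI):

```shell
# Tell Pacemaker to keep running resources without quorum
# (only sane on a two-node cluster with working fencing).
pcs property set no-quorum-policy=ignore

# Or enable corosync's two-node special case in /etc/corosync/corosync.conf:
#   quorum {
#       provider: corosync_votequorum
#       two_node: 1
#   }
# ...and restart corosync on both nodes.
```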
Cheers!
Rob
resource tapas {
    protocol C;

    startup {
        wfc-timeout 0;            ## Infinite!
        outdated-wfc-timeout 120;
        degr-wfc-timeout 120;     ## 2 minutes.
    }

    disk {
        on-io-error detach;
    }

    handlers {
        split-brain "/opt/sol/tapas/bin/split-brain-helper.sh";
        fence-peer "/usr/lib/drbd/crm-fence-peer.9.sh";
        unfence-peer "/usr/lib/drbd/crm-unfence-peer.9.sh";
    }

    net {
        fencing resource-only;
        # after-sb-0pri discard-least-changes;
    }

    device /dev/drbd0;
    disk /dev/mapper/centos-drbd;
    meta-disk internal;

    on vmnbiaas1 {
        address 172.17.5.61:7789;
    }
    on vmnbiaas2 {
        address 172.17.5.62:7789;
    }
}