The easiest fix is to set up proper stonith in Pacemaker (configure it,
then test it), then change DRBD to use 'fencing resource-and-stonith;'
together with the handlers
'fence-peer "/usr/lib/drbd/crm-fence-peer.sh";' and
'after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";'.
That way, you avoid split-brains in the first place. You need stonith in
Pacemaker anyway, so it's a win-win.
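
As a rough sketch of where those settings go (assuming DRBD 8.4, where
'fencing' lives in the disk section, and assuming the handler scripts are
installed under /usr/lib/drbd/ as in the paths above; putting them in the
common section is only one option, they can equally go in each resource):

```
common {
    disk {
        # Freeze I/O and call the fence-peer handler when the peer is lost.
        fencing resource-and-stonith;
    }

    handlers {
        # Adds a Pacemaker constraint so the outdated peer cannot be promoted.
        fence-peer          "/usr/lib/drbd/crm-fence-peer.sh";
        # Removes that constraint once the peer has finished resyncing.
        after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
    }

    # ... existing startup, options and net settings stay as they are ...
}
```

The unfence script is hooked to after-resync-target here, as in the
Pacemaker integration section of the DRBD 8.4 User's Guide, so the
constraint is lifted only after the resync has completed. None of this
helps until stonith itself is working and tested in Pacemaker.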
On 15/04/15 08:49 PM, Keith Ouellette wrote:
> We have two nodes that have two drbd resources for two different
> applications on a pair of servers managed by Pacemaker. All looks to
> work fine when the primary node is put into standby or power cycled.
> Meaning that the drbd Primary gets moved to the new active node and the
> applications continue to run as expected. I have an issue when I pull
> the Ethernet out of the primary node and let it sit there for about a
> half hour. When I unplug it the Primary gets moved as expected and the
> applications continue to work. However, when I plug the Ethernet back
> into the system, both nodes go into a standalone state.
>
>
>
> *Node 1:*
>
>
>
> drbd driver loaded OK; device status:
> version: 8.4.3 (api:1/proto:86-101)
> srcversion: F97798065516C94BE0F27DC
> m:res cs ro ds p mounted fstype
> 0:r0 StandAlone Primary/Unknown UpToDate/DUnknown r----- ext4
> 1:r1 StandAlone Primary/Unknown UpToDate/DUnknown r----- ext4
>
>
>
> *Node 2:*
>
>
>
> drbd driver loaded OK; device status:
> version: 8.4.3 (api:1/proto:86-101)
> srcversion: F97798065516C94BE0F27DC
> m:res cs ro ds p mounted fstype
> 0:r0 StandAlone Secondary/Unknown UpToDate/DUnknown r-----
> 1:r1 StandAlone Secondary/Unknown UpToDate/DUnknown r-----
>
>
>
> As you can see, one node knows it is Primary, and that is where the
> applications continue to run. The second node knows it should be
> Secondary. All I do to resolve this is connect the resources on each
> node, with the Secondary using the --discard-my-data option.
>
>
>
> Is there a way to have the connects done automatically? This looks to be
> a type of "split brain", and I do have split-brain handling configured in
> global_common.conf:
>
>
>
> global {
>     usage-count no;
>     # minor-count dialog-refresh disable-ip-verification
> }
>
> common {
>     handlers {
>         # These are EXAMPLE handlers only.
>         # They may have severe implications,
>         # like hard resetting the node under certain circumstances.
>         # Be careful when choosing your poison.
>
>         # pri-on-incon-degr "/usr/lib/drbd/notify-pri-on-incon-degr.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
>         # pri-lost-after-sb "/usr/lib/drbd/notify-pri-lost-after-sb.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
>         # local-io-error "/usr/lib/drbd/notify-io-error.sh; /usr/lib/drbd/notify-emergency-shutdown.sh; echo o > /proc/sysrq-trigger ; halt -f";
>         # fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
>         split-brain "/usr/lib/drbd/notify-split-brain.sh root";
>         # out-of-sync "/usr/lib/drbd/notify-out-of-sync.sh root";
>         # before-resync-target "/usr/lib/drbd/snapshot-resync-target-lvm.sh -p 15 -- -c 16k";
>         # after-resync-target /usr/lib/drbd/unsnapshot-resync-target-lvm.sh;
>     }
>
>     startup {
>         # wfc-timeout degr-wfc-timeout outdated-wfc-timeout wait-after-sb
>     }
>
>     options {
>         # cpu-mask on-no-data-accessible
>     }
>
>     disk {
>         # size max-bio-bvecs on-io-error fencing disk-barrier disk-flushes
>         # disk-drain md-flushes resync-rate resync-after al-extents
>         # c-plan-ahead c-delay-target c-fill-target c-max-rate
>         # c-min-rate disk-timeout
>     }
>
>     net {
>         after-sb-0pri discard-zero-changes;
>         after-sb-1pri discard-secondary;
>         # after-sb-2pri consensus;
>         after-sb-2pri disconnect;
>         # protocol timeout max-epoch-size max-buffers unplug-watermark
>         # connect-int ping-int sndbuf-size rcvbuf-size ko-count
>         # allow-two-primaries cram-hmac-alg shared-secret after-sb-0pri
>         # after-sb-1pri after-sb-2pri always-asbp rr-conflict
>         # ping-timeout data-integrity-alg tcp-cork on-congestion
>         # congestion-fill congestion-extents csums-alg verify-alg
>         # use-rle
>     }
> }
>
>
>
> The following are also the resource files:
>
>
>
> r0.res:
>
>
>
> resource r0 {
>     on Node1 {
>         volume 0 {
>             device /dev/drbd0;
>             disk /dev/ Node1-vg/AOS;
>             flexible-meta-disk internal;
>         }
>         address 10.0.6.221:7788;
>     }
>     on Node2 {
>         volume 0 {
>             device /dev/drbd0;
>             disk /dev/ Node2-vg/AOS;
>             flexible-meta-disk internal;
>         }
>         address 10.0.6.222:7788;
>     }
> }
>
>
>
> r1.res:
>
>
>
> resource r1 {
>     on Node1 {
>         volume 0 {
>             device /dev/drbd1;
>             disk /dev/ Node1-vg/Controller;
>             flexible-meta-disk internal;
>         }
>         address 10.0.6.221:7789;
>     }
>     on Node2 {
>         volume 0 {
>             device /dev/drbd1;
>             disk /dev/ Node2-vg/Controller;
>             flexible-meta-disk internal;
>         }
>         address 10.0.6.222:7789;
>     }
> }
>
>
>
> I am not sure if this is possible, but I figured I would ask.
>
>
>
> Thanks,
> Keith
>
> _______________________________________________
> drbd-user mailing list
> drbd-user at lists.linbit.com
> http://lists.linbit.com/mailman/listinfo/drbd-user
>
--
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without
access to education?