We have a pair of servers managed by Pacemaker, with two DRBD resources for two different applications. Everything appears to work fine when the primary node is put into standby or power cycled: the DRBD Primary moves to the new active node and the applications continue to run as expected.

I have an issue when I pull the Ethernet cable out of the primary node and leave it unplugged for about half an hour. When I unplug it, the Primary moves as expected and the applications continue to work. However, when I plug the Ethernet cable back in, both nodes go into a StandAlone state.

Node 1:

    drbd driver loaded OK; device status:
    version: 8.4.3 (api:1/proto:86-101)
    srcversion: F97798065516C94BE0F27DC
    m:res  cs          ro               ds                 p  mounted  fstype
    0:r0   StandAlone  Primary/Unknown  UpToDate/DUnknown  r-----  ext4
    1:r1   StandAlone  Primary/Unknown  UpToDate/DUnknown  r-----  ext4

Node 2:

    drbd driver loaded OK; device status:
    version: 8.4.3 (api:1/proto:86-101)
    srcversion: F97798065516C94BE0F27DC
    m:res  cs          ro                 ds                 p  mounted  fstype
    0:r0   StandAlone  Secondary/Unknown  UpToDate/DUnknown  r-----
    1:r1   StandAlone  Secondary/Unknown  UpToDate/DUnknown  r-----

As you can see, one node knows it is Primary, and that is the node the applications continue to run on. The second node knows it should be Secondary. All I do to resolve this is connect the resources on each node, with the Secondary using the --discard-my-data option.

Is there a way to have the connects done automatically? This looks to be a type of "split brain", and I do have that configured in global_common.conf:

    global {
        usage-count no;
        # minor-count dialog-refresh disable-ip-verification
    }

    common {
        handlers {
            # These are EXAMPLE handlers only.
            # They may have severe implications,
            # like hard resetting the node under certain circumstances.
            # Be careful when chosing your poison.

            # pri-on-incon-degr "/usr/lib/drbd/notify-pri-on-incon-degr.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
            # pri-lost-after-sb "/usr/lib/drbd/notify-pri-lost-after-sb.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
            # local-io-error "/usr/lib/drbd/notify-io-error.sh; /usr/lib/drbd/notify-emergency-shutdown.sh; echo o > /proc/sysrq-trigger ; halt -f";
            # fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
            split-brain "/usr/lib/drbd/notify-split-brain.sh root";
            # out-of-sync "/usr/lib/drbd/notify-out-of-sync.sh root";
            # before-resync-target "/usr/lib/drbd/snapshot-resync-target-lvm.sh -p 15 -- -c 16k";
            # after-resync-target /usr/lib/drbd/unsnapshot-resync-target-lvm.sh;
        }

        startup {
            # wfc-timeout degr-wfc-timeout outdated-wfc-timeout wait-after-sb
        }

        options {
            # cpu-mask on-no-data-accessible
        }

        disk {
            # size max-bio-bvecs on-io-error fencing disk-barrier disk-flushes
            # disk-drain md-flushes resync-rate resync-after al-extents
            # c-plan-ahead c-delay-target c-fill-target c-max-rate
            # c-min-rate disk-timeout
        }

        net {
            after-sb-0pri discard-zero-changes;
            after-sb-1pri discard-secondary;
            # after-sb-2pri consensus;
            after-sb-2pri disconnect;
            # protocol timeout max-epoch-size max-buffers unplug-watermark
            # connect-int ping-int sndbuf-size rcvbuf-size ko-count
            # allow-two-primaries cram-hmac-alg shared-secret after-sb-0pri
            # after-sb-1pri after-sb-2pri always-asbp rr-conflict
            # ping-timeout data-integrity-alg tcp-cork on-congestion
            # congestion-fill congestion-extents csums-alg verify-alg
            # use-rle
        }
    }

The following are the resource files.

r0.res:

    resource r0 {
        on Node1 {
            volume 0 {
                device /dev/drbd0;
                disk /dev/Node1-vg/AOS;
                flexible-meta-disk internal;
            }
            address 10.0.6.221:7788;
        }
        on Node2 {
            volume 0 {
                device /dev/drbd0;
                disk /dev/Node2-vg/AOS;
                flexible-meta-disk internal;
            }
            address 10.0.6.222:7788;
        }
    }

r1.res:

    resource r1 {
        on Node1 {
            volume 0 {
                device /dev/drbd1;
                disk /dev/Node1-vg/Controller;
                flexible-meta-disk internal;
            }
            address 10.0.6.221:7789;
        }
        on Node2 {
            volume 0 {
                device /dev/drbd1;
                disk /dev/Node2-vg/Controller;
                flexible-meta-disk internal;
            }
            address 10.0.6.222:7789;
        }
    }

I am not sure if this is possible, but I figured I would ask.

Thanks,
Keith

Keith Ouellette
KeithO at fibermountain.com
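
For reference, the manual recovery described in this message (reconnecting each StandAlone resource, with the data discarded on the Secondary) could be sketched roughly as below. This is only an illustration under stated assumptions: the status text is assumed to have the same row layout as the output quoted in the message, the helper name reconnect_commands is hypothetical, and the script only builds the drbdadm command strings for an operator to review. It does not run them, and it is not an automatic split-brain policy. Note that --discard-my-data throws away any changes made on the node where it is used.

```python
import re

# Matches resource rows of the status table quoted in the message, e.g.
#   0:r0 StandAlone Secondary/Unknown UpToDate/DUnknown r-----
# The header row ("m:res cs ro ...") does not match because it lacks a minor number.
ROW = re.compile(
    r"^\s*\d+:(?P<res>\S+)\s+(?P<cs>\S+)\s+(?P<local>\w+)/(?P<peer>\w+)\s"
)

# Sample input copied from the Node 2 output in the message.
NODE2_STATUS = """\
m:res cs ro ds p mounted fstype
0:r0 StandAlone Secondary/Unknown UpToDate/DUnknown r-----
1:r1 StandAlone Secondary/Unknown UpToDate/DUnknown r-----
"""


def reconnect_commands(status_text):
    """Build (but do not run) a drbdadm command for each StandAlone resource.

    Assumption: the local role decides who gives up its data, mirroring the
    manual procedure in the message (the Secondary discards, the Primary keeps).
    """
    commands = []
    for line in status_text.splitlines():
        match = ROW.match(line)
        if match is None or match.group("cs") != "StandAlone":
            continue  # skip headers, version lines, and connected resources
        resource = match.group("res")
        if match.group("local") == "Secondary":
            commands.append(f"drbdadm connect --discard-my-data {resource}")
        elif match.group("local") == "Primary":
            commands.append(f"drbdadm connect {resource}")
    return commands


if __name__ == "__main__":
    for command in reconnect_commands(NODE2_STATUS):
        print(command)
```

Run against the Node 2 output, the sketch lists one connect command with --discard-my-data per resource; run against the Node 1 output, it would list plain connect commands. Printing rather than executing is deliberate, since discarding data on the wrong node is not recoverable.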