[DRBD-user] fence-peer

Kaloyan Kovachev kkovachev at varna.net
Mon Jul 1 11:39:24 CEST 2013

Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.


Hi,
The email you quoted contains an attachment with my changes to the
obliterate_peer script.
Download it from
http://drbd.10923.n7.nabble.com/attachment/3301/0/rhcm-fence-peer.sh to
/usr/lib/drbd and make sure it is executable.
Then change the fence-peer line to:
fence-peer "/usr/lib/drbd/rhcm-fence-peer.sh on kvm5 10.2.2.50 on kvm6 10.2.2.51";

In the disk section, set fencing to resource-and-stonith.
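
Combined with the disk options you already have, that section would then
be something like this (again only a sketch):

    disk {
            on-io-error detach; al-extents 3389; resync-rate 75M;
            fencing resource-and-stonith;
    }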

You may (optionally) want to change the key used to connect to the peer:
SSH_CMD="ssh -i /root/.ssh/id_rsa"
and you may also want to disable the logging to /tmp (at the end of the
script), or replace it with an email notification.
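
If you go the email route, a hypothetical replacement for the log line at
the end of the script could look something like the following (DRBD exports
DRBD_RESOURCE to its handlers; the recipient address is a placeholder and
mail/mailx must be installed and working on both nodes):

    # instead of appending to a file under /tmp, send a mail notification
    echo "fence-peer handler ran on $(hostname) for resource $DRBD_RESOURCE" \
        | mail -s "DRBD fence-peer fired" admin@example.com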

On 2013-06-29 07:41, cesar wrote:
> Hi Kaloyan Kovachev
> 
> I would humbly ask for your help.
> I would be very grateful if you could help me.
> 
> I have Proxmox VE for HA of VMs (based on KVM) + LVM2 + DRBD 8.4.2 (NIC
> to NIC), and one VM in HA that uses two DRBD resources, i.e. the VM has
> two virtual disks.
> 
> My problem is that I lost the DRBD connection:
> 
> version: 8.4.2 (api:1/proto:86-101)
> GIT-hash: 7ad5f850d711223713d6dcadc3dd48860321070c build by root@kvm5, 2013-06-16 13:44:51
>  0: cs:StandAlone ro:Primary/Unknown ds:UpToDate/DUnknown   r-----
>     ns:12443007 nr:0 dw:46472892 dr:1719821020 al:3096 bm:1408 lo:1 pe:0 ua:0 ap:1 ep:1 wo:f oos:3319816
>  1: cs:StandAlone ro:Primary/Unknown ds:UpToDate/DUnknown   r-----
>     ns:4764396 nr:0 dw:3791840 dr:1057033682 al:1164 bm:336 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:1783620
> 
> I think there may be two causes:
> Cause 1: the NIC is a Realtek; as soon as I can, I will replace it with
> an Intel one.
> Cause 2: I use the net directive data-integrity-alg md5; (I would not
> want to remove it), which can have problems on my PC (ASUS P8H77-M PRO);
> also, I don't use RAID.
> 
> *And Mr. Lars Ellenberg told me:*
> "With special purpose built fencing handlers,
> we may be able to fix your setup so it will freeze IO during the
> disconnected period, reconnect, and replay pending buffers,
> without any reset."
> 
> But for now I would like to know how to apply this change.
> Since I am not experienced with DRBD, I wish someone could show me what
> the global_common.conf or *.res configuration files should finally look
> like.
> 
> Note: my nodes have SSH keys set up for communication between them.
> 
> This is my current configuration with DRBD version 8.4.2.
> global_common.conf file:
> 
> global {
>         usage-count yes;
>         # minor-count dialog-refresh disable-ip-verification
> }
> 
> common {
>         protocol C;
> 
>         handlers {
>                 pri-on-incon-degr "/usr/lib/drbd/notify-pri-on-incon-degr.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
>                 pri-lost-after-sb "/usr/lib/drbd/notify-pri-lost-after-sb.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
>                 local-io-error "/usr/lib/drbd/notify-io-error.sh; /usr/lib/drbd/notify-emergency-shutdown.sh; echo o > /proc/sysrq-trigger ; halt -f";
>                 # fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
>                 split-brain "/usr/lib/drbd/notify-split-brain.sh some-user@my-domain.com";
>                 out-of-sync "/usr/lib/drbd/notify-out-of-sync.sh some-user@my-domain.com";
>                 # before-resync-target "/usr/lib/drbd/snapshot-resync-target-lvm.sh -p 15 -- -c 16k";
>                 # after-resync-target /usr/lib/drbd/unsnapshot-resync-target-lvm.sh;
>         }
> 
>         startup {
>                 # wfc-timeout degr-wfc-timeout outdated-wfc-timeout wait-after-sb
>                 wfc-timeout 30; degr-wfc-timeout 20; outdated-wfc-timeout 15;
>         }
> 
>         options {
>                 # cpu-mask on-no-data-accessible
>                 cpu-mask 0;
>         }
> 
>         disk {
>                 # size max-bio-bvecs on-io-error fencing disk-barrier disk-flushes
>                 # disk-drain md-flushes resync-rate resync-after al-extents
>                 # c-plan-ahead c-delay-target c-fill-target c-max-rate
>                 # c-min-rate disk-timeout
>                 on-io-error detach; al-extents 3389; resync-rate 75M;
>         }
> 
>         net {
>                 # protocol timeout max-epoch-size max-buffers unplug-watermark
>                 # connect-int ping-int sndbuf-size rcvbuf-size ko-count
>                 # allow-two-primaries cram-hmac-alg shared-secret after-sb-0pri
>                 # after-sb-1pri after-sb-2pri always-asbp rr-conflict
>                 # ping-timeout data-integrity-alg tcp-cork on-congestion
>                 # congestion-fill congestion-extents csums-alg verify-alg
>                 # use-rle
>                 sndbuf-size 0; no-tcp-cork; unplug-watermark 16; max-buffers 8000; max-epoch-size 8000;
>                 data-integrity-alg md5;
>                 verify-alg sha1;
>         }
> }
> 
> r0.res file:
> resource r0 {
> 
>         protocol C;
> 
>         startup {
>                 #wfc-timeout  15;
>                 #degr-wfc-timeout 60;
>                 become-primary-on both;
>         }
> 
>         net {
>                 #cram-hmac-alg sha1;
>                 #shared-secret "my-secret";
>                 allow-two-primaries;
>                 after-sb-0pri discard-zero-changes;
>                 after-sb-1pri discard-secondary;
>                 after-sb-2pri disconnect;
>         }
> 
>         on kvm5 {
>                 device /dev/drbd0;
>                 disk /dev/sda3;
>                 address 10.2.2.50:7788;
>                 meta-disk internal;
>         }
> 
>         on kvm6 {
>                 device /dev/drbd0;
>                 disk /dev/sda3;
>                 address 10.2.2.51:7788;
>                 meta-disk internal;
>         }
> }
> 
> r1.res file:
> resource r1 {
> 
>         protocol C;
> 
>         startup {
>                 #wfc-timeout  15;
>                 #degr-wfc-timeout 60;
>                 become-primary-on both;
>         }
> 
>         net {
>                 #cram-hmac-alg sha1;
>                 #shared-secret "my-secret";
>                 allow-two-primaries;
>                 after-sb-0pri discard-zero-changes;
>                 after-sb-1pri discard-secondary;
>                 after-sb-2pri disconnect;
>         }
> 
>         on kvm5 {
>                 device /dev/drbd1;
>                 disk /dev/sdb3;
>                 address 10.2.2.50:7789;
>                 meta-disk internal;
>         }
> 
>         on kvm6 {
>                 device /dev/drbd1;
>                 disk /dev/sdb3;
>                 address 10.2.2.51:7789;
>                 meta-disk internal;
>         }
> }
> 
> Best regards
> Cesar
> 


