Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.
On Tue, Oct 30, 2012 at 10:03:09PM +0400, Zohair Raza wrote:
> On Tue, Oct 30, 2012 at 7:58 PM, Digimer <lists at alteeve.ca> wrote:
> > On 10/30/2012 05:43 AM, Zohair Raza wrote:
> > > I have rebuilt the setup and enabled fencing.
> > >
> > > Manual fencing (fence_ack_manual) works okay when I fence a dead
> > > node from the command line, but it does not happen automatically.

It is called "Manual fencing" for a reason ...

> > Manual fencing is not in any way supported. You must be able to call
> > 'fence_node <peer>' and have the remote node reset. If this doesn't
> > happen, your fencing is not sufficient.
>
> fence_node <peer> doesn't work for me
>
> fence_node node2 says
>
> fence node2 failed

Which is why you need a *real* fencing device for automatic fencing.
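For illustration, a cluster.conf fencing section backed by a real
power-fencing agent might look something like the sketch below. This is
only a sketch, assuming nodes with IPMI-style management interfaces;
fence_ipmilan is one agent of many, and the device names, addresses,
and credentials are placeholders for whatever the hardware actually
provides:

    <clusternode name="node1" votes="1" nodeid="1">
        <fence>
            <method name="ipmi">
                <!-- each node points at its own BMC -->
                <device name="ipmi_node1"/>
            </method>
        </fence>
    </clusternode>
    ...
    <fencedevices>
        <fencedevice name="ipmi_node1" agent="fence_ipmilan"
                     ipaddr="10.0.0.1" login="admin" passwd="secret"/>
        <fencedevice name="ipmi_node2" agent="fence_ipmilan"
                     ipaddr="10.0.0.2" login="admin" passwd="secret"/>
    </fencedevices>

Whatever the device, the test is the one already named in this thread:
'fence_node node2' run from node1 must actually power-cycle node2.
Until that works, fenced will keep retrying exactly as in the logs below.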
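On the DRBD side, note that the drbd.conf quoted below combines
allow-two-primaries with no fencing policy at all. DRBD can be wired
into the cluster's fencing so that it suspends I/O until the peer's
fate is known, via a fencing policy plus a fence-peer handler. A
minimal sketch, assuming a DRBD 8.3/8.4-era setup; the handler path
and name are assumptions that vary by package (recent drbd-utils ship
rhcs_fence for cman-based clusters, older installs had
obliterate-peer.sh), so check what is actually installed:

    resource res0 {
        disk {
            # don't touch the data until the peer is known to be
            # dead, outdated, or fenced:
            fencing resource-and-stonith;
        }
        handlers {
            # assumed path -- verify against your drbd-utils package:
            fence-peer "/usr/lib/drbd/rhcs_fence";
        }
        ...
    }

With resource-and-stonith, DRBD freezes I/O on the surviving node and
resumes only after the fence-peer handler reports success, which closes
exactly the window that dual-primary without fencing leaves open.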
> > > Logs:
> > >
> > > Oct 30 12:05:52 node1 kernel: dlm: closing connection to node 2
> > > Oct 30 12:05:52 node1 fenced[1414]: fencing node node2
> > > Oct 30 12:05:52 node1 kernel: GFS2: fsid=cluster1:gfs.0: jid=1: Trying
> > > to acquire journal lock...
> > > Oct 30 12:05:52 node1 fenced[1414]: fence node2 dev 0.0 agent
> > > fence_manual result: error from agent
> > > Oct 30 12:05:52 node1 fenced[1414]: fence node2 failed
> > > Oct 30 12:05:55 node1 fenced[1414]: fencing node node2
> > >
> > > Cluster.conf:
> > >
> > > <?xml version="1.0"?>
> > > <cluster name="cluster1" config_version="3">
> > >   <cman two_node="1" expected_votes="1"/>
> > >   <clusternodes>
> > >     <clusternode name="node1" votes="1" nodeid="1">
> > >       <fence>
> > >         <method name="single">
> > >           <device name="manual" ipaddr="192.168.23.128"/>
> > >         </method>
> > >       </fence>
> > >     </clusternode>
> > >     <clusternode name="node2" votes="1" nodeid="2">
> > >       <fence>
> > >         <method name="single">
> > >           <device name="manual" ipaddr="192.168.23.129"/>
> > >         </method>
> > >       </fence>
> > >     </clusternode>
> > >   </clusternodes>
> > >   <fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="3"/>
> > >   <fencedevices>
> > >     <fencedevice name="manual" agent="fence_manual"/>
> > >   </fencedevices>
> > > </cluster>
> > >
> > > drbd.conf:
> > >
> > > resource res0 {
> > >   protocol C;
> > >   startup {
> > >     wfc-timeout 20;
> > >     degr-wfc-timeout 10;
> > >     # we will keep this commented until tested successfully:
> > >     become-primary-on both;
> > >   }
> > >   net {
> > >     # the encryption part can be omitted when using a dedicated link
> > >     # for DRBD only:
> > >     # cram-hmac-alg sha1;
> > >     # shared-secret anysecrethere123;
> > >     allow-two-primaries;
> > >   }
> > >   on node1 {
> > >     device /dev/drbd0;
> > >     disk /dev/sdb1;
> > >     address 192.168.23.128:7789;
> > >     meta-disk internal;
> > >   }
> > >   on node2 {
> > >     device /dev/drbd0;
> > >     disk /dev/sdb1;
> > >     address 192.168.23.129:7789;
> > >     meta-disk internal;
> > >   }
> > > }
> > >
> > > Regards,
> > > Zohair Raza
> > >
> > > On Tue, Oct 30, 2012 at 12:39 PM, Zohair Raza
> > > <engineerzuhairraza at gmail.com> wrote:
> > >
> > > Hi,
> > >
> > > thanks for the explanation
> > >
> > > On Mon, Oct 29, 2012 at 9:26 PM, Digimer <lists at alteeve.ca> wrote:
> > >
> > > When a node stops responding, it cannot be assumed to be dead.
> > > It has to be put into a known state, and that is what fencing does.
> > > Disabling fencing is like driving without a seatbelt. Ya, it'll
> > > save you a bit of time at first, but the first time you get a
> > > split-brain, you are going right through the windshield. Will you
> > > think it was worth it then?
> > >
> > > A split-brain is when neither node fails, but they can't communicate
> > > anymore. If each assumes the other is gone and begins using the
> > > shared storage without coordinating with its peer, your data will be
> > > corrupted very, very quickly.
> > >
> > > In my scenario, I have two Samba servers in two different locations,
> > > so chances of this are obvious.
> > >
> > > Heck, even if all you had was a floating IP; disabling fencing means
> > > that both nodes would try to use that IP. Ask what your
> > > clients/switches/routers think of that.
> > >
> > > I don't need a floating IP, as I am not looking for high availability
> > > but two Samba servers synced with each other, so roaming employees
> > > can have faster access to their files in both locations. I skipped
> > > fencing as per Mautris's suggestion, but I still can't figure out why
> > > the fencing daemon was not able to fence the other node.
> > >
> > > Please read this for more details;
> > >
> > > https://alteeve.ca/w/2-Node_Red_Hat_KVM_Cluster_Tutorial#Concept.3B_Fencing

--
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
__
please don't Cc me, but send to list -- I'm subscribed