Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.
Hi Colin,

Inline reply below:

On Fri, Oct 15, 2010 at 2:01 PM, Colin Simpson <Colin.Simpson at iongeo.com> wrote:

> Hi
>
> I have a working test cluster of RH Cluster Suite with various GFS2 file
> systems on top of a DRBD Primary/Primary device.
>
> I have the recommended GFS setup in drbd.conf, i.e.
>
> allow-two-primaries;
> after-sb-0pri discard-zero-changes;
> after-sb-1pri discard-secondary;
> after-sb-2pri disconnect;
>
> Now I have been trying to think of the danger scenarios that might arise
> with my setup.
>
> So I have a few questions (maybe quite a few):
>
> 1/ When one node is brought back up after being down, it starts to sync
> up to the "newer" copy (I'm hoping).
>
> I presume GFS shouldn't be mounted at this point on the just-brought-up
> node (as data will not be consistent between the two GFS mounts and the
> block device will be changing underneath it)?

The drbd service should start before the clvmd service. The syncing node
will sync and/or be immediately ready for use when clvmd comes up. I do
the following to ensure this is the case:

cd /etc/init.d   # assumed: the init scripts being patched live here

/usr/bin/patch <<EOF
--- clvmd.orig	2010-09-13 17:15:17.000000000 -0500
+++ clvmd	2010-09-13 17:36:46.000000000 -0500
@@ -7,6 +7,8 @@
 #
 ### BEGIN INIT INFO
 # Provides: clvmd
+# Required-Start: drbd
+# Required-Stop: drbd
 # Short-Description: Clustered LVM Daemon
 ### END INIT INFO
EOF

/usr/bin/patch <<EOF
--- drbd.orig	2010-09-13 17:15:17.000000000 -0500
+++ drbd	2010-09-13 17:39:46.000000000 -0500
@@ -15,8 +15,8 @@
 # Should-Stop: sshd multipathd
 # Default-Start: 2 3 4 5
 # Default-Stop: 0 1 6
-# X-Start-Before: heartbeat corosync
-# X-Stop-After: heartbeat corosync
+# X-Start-Before: heartbeat corosync clvmd
+# X-Stop-After: heartbeat corosync clvmd
 # Short-Description: Control drbd resources.
 ### END INIT INFO
EOF

cd -

# set up the proper start/stop order and make sure it sticks
for X in drbd clvmd ; do
    /sbin/chkconfig $X resetpriorities
done

> I mean, does it, or is there any way of running drbd so it ignores the
> out-of-date primary's data (on the node just brought up) and passes all
> the requests through to the "good" primary (until it is sync'd)?

That's what it does, from my observation.

> Should I have my own start-up script to only start cman and clvmd when I
> finally see
>
> 1: cs:Connected st:Primary/Primary ds:UpToDate/UpToDate
>
> and not
>
> 1: cs:SyncTarget st:Primary/Primary ds:Inconsistent/UpToDate
>
> ? What is recommended (or what do people do)? Or is there some way of
> achieving this already?

Nah. Just make sure drbd starts before clvmd.

> Just starting up cman still seems to try to start services that then
> fail even before clvmd is running (including services that are
> children of file systems in the cluster.conf file):
>
> <clusterfs fstype="gfs" ref="datahome">
>     <nfsexport ref="tcluexports">
>         <nfsclient name=" " ref="NFSdatahomeclnt"/>
>     </nfsexport>
> </clusterfs>
>
> So I'm presuming I need to delay starting cman and clvmd, and not just
> clvmd?

clvmd should be dependent on drbd, that is all.

> I'd like automatic cluster recovery.
>
> 2/ Is discard-older-primary not better in a Primary/Primary? Or is it
> inappropriate in dual Primary?

With the split-brain settings you mentioned further up, you have automatic
recovery for the safe cases. Depending on your data, "discard-least-changes"
may be a policy you can look at. For the non-safe cases, I personally prefer
human intervention.
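If it helps, here is a rough sketch of how those policies could look in the
net section of a drbd.conf resource. The resource name "r0" and the handlers
section are examples I'm adding, not from your config, so adjust to your own
setup:

resource r0 {
  net {
    allow-two-primaries;
    after-sb-0pri discard-zero-changes;  # or discard-least-changes, depending on your data
    after-sb-1pri discard-secondary;
    after-sb-2pri disconnect;
  }
  handlers {
    # for the non-safe cases DRBD just disconnects; have it notify you so a
    # human can pick the victim with "drbdadm -- --discard-my-data connect <res>"
    split-brain "/usr/lib/drbd/notify-split-brain.sh root";
  }
}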
> 3/ Is there any merit in always stopping one node first, so that at
> start-up you know which one has the most up-to-date data (say if there is
> a start-up PSU failure)? Will a shut-down DRBD node with a stopped GFS and
> drbd still have a consistent (though out-of-date) file system?

DRBD metadata tracks which one is most up to date.

> 4/ I was thinking of the bad (hopefully unlikely) scenario where you bring
> up an out-of-date node A (older than B's data), and it hopefully comes up
> clean (if the above question allows). It starts working; some time later
> you bring up node B, which originally had a later set of data before A and
> B went down.

That should be prevented by something like:

startup {
    wfc-timeout      0;  # Wait forever for initial connection
    degr-wfc-timeout 60; # Wait only 60 seconds if this node was a degraded cluster
}

"A" would wait indefinitely for "B" to start. Only if you manually go to the
console and type "yes" to abort the wfc-timeout will "A" come up inconsistent.

> Based on the recommended config, will B now take all of A's data?

Nope. You have to manually resolve.

> 5/ Is it good practice (or even possible) to use the same private
> interface for RH Cluster comms, clvmd etc. (OpenAIS) that drbd uses? RHCS
> seems to make this hard: use an internal interface for cluster comms and
> have the services presented on a different interface.

That's a performance question and depends on how fast your interconnect is.
If your backing storage can saturate the link DRBD runs over, you'll want to
run the totem protocol over a different interconnect. If you're using
something like InfiniBand or 10GbE it likely will not be a problem unless
you have some wicked-fast solid-state backing storage.

Cheers,

-JR
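P.S. If you do decide to split the traffic, a minimal sketch of the dedicated
DRBD interconnect side (hostnames, addresses, device and disk paths below are
placeholders, not taken from your setup); cman/totem would then bind to a
different subnet, based on what the node names in cluster.conf resolve to:

resource r0 {
  device    /dev/drbd1;
  disk      /dev/sdb1;
  meta-disk internal;

  # replication traffic goes over a dedicated back-to-back link
  on nodeA {
    address 192.168.10.1:7789;
  }
  on nodeB {
    address 192.168.10.2:7789;
  }
}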