Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.
Hi,

I have a working test cluster running RH Cluster Suite with various GFS2 file systems on top of a DRBD Primary/Primary device. I have the recommended GFS setup in drbd.conf, i.e.

    allow-two-primaries;
    after-sb-0pri discard-zero-changes;
    after-sb-1pri discard-secondary;
    after-sb-2pri disconnect;

Now I have been trying to think through the danger scenarios that might arise with my setup, so I have a few questions (maybe quite a few):

1/ When one node is brought back up after being down, it starts to sync up to the "newer" copy (I'm hoping). I presume GFS shouldn't be mounted at this point on the just-brought-up node, as data will not be consistent between the two GFS mounts and the block device will be changing underneath it? It seems to have caused oopses in the GFS kernel modules when I have tried it before. Is there any way of running DRBD so that it ignores the out-of-date primary's data (on the node just brought up) and passes all requests through to the "good" primary until it is synced? Or should I have my own start-up script that only starts cman and clvmd once I finally see

    1: cs:Connected st:Primary/Primary ds:UpToDate/UpToDate

and not

    1: cs:SyncTarget st:Primary/Primary ds:Inconsistent/UpToDate

What is recommended (or what do people do)? Or is there some way of achieving this already? (I've put a rough sketch of the sort of script I mean below, after the questions.) Just starting cman still seems to try to start services that then fail, even before clvmd is running, including services that are children of file systems in the cluster.conf file:

    <clusterfs fstype="gfs" ref="datahome">
        <nfsexport ref="tcluexports">
            <nfsclient name=" " ref="NFSdatahomeclnt"/>
        </nfsexport>
    </clusterfs>

So I'm presuming I need to delay starting both cman and clvmd, not just clvmd? I'd like automatic cluster recovery.

2/ Would discard-older-primary not be better in a Primary/Primary setup, or is it inappropriate with dual primaries?

3/ Is there any merit in always stopping one node first, so that at start-up you know which one has the most up-to-date data (say if there is a start-up PSU failure)? Will a shut-down DRBD node, with GFS and drbd stopped, still have a consistent (though out-of-date) file system?

4/ I was thinking of the bad (hopefully unlikely) scenario where you bring up an out-of-date node A (older than B's data) and it hopefully comes up clean (if question 1 allows). It starts working, and some time later you bring up node B, which had a later set of data before A and B originally went down. Based on the recommended config, will B now take all of A's data? Will you end up with a mishmash of A's and B's data at the block level (upsetting GFS)? Or will A take B's data? B taking all of A's data seems best (least worst) to me, as things may well have moved on quite a bit, and we'd hope B wasn't too far behind when it went down.

5/ Is it good practice (or even possible) to use the same private interface for RH Cluster comms, clvmd etc. (OpenAIS) that drbd uses? RHCS seems to make it hard to use an internal interface for cluster comms while presenting the services on a different interface.

For reference, my drbd.conf test version is below. Hopefully this is pretty clear, though I'm not convinced I have been....
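In case it makes question 1 clearer, this is a minimal sketch of the kind of start-up delay script I have in mind (not something I'm running yet; the device number, the timeout and the init script names are just assumptions for illustration):

    #!/bin/sh
    # Sketch: wait until drbd device 1 reports Connected and
    # UpToDate/UpToDate before starting the cluster stack, so GFS is
    # never mounted on an Inconsistent device.
    DEV=1
    TIMEOUT=600        # give up after 10 minutes and leave it to manual recovery
    waited=0
    while ! grep -q "^ *$DEV: cs:Connected .*ds:UpToDate/UpToDate" /proc/drbd
    do
        sleep 5
        waited=$((waited + 5))
        if [ "$waited" -ge "$TIMEOUT" ]; then
            echo "drbd$DEV still not UpToDate after ${TIMEOUT}s, not starting cluster" >&2
            exit 1
        fi
    done
    # assumed RHEL-style init scripts
    service cman start
    service clvmd start
    service gfs start
    service rgmanager start

It deliberately only checks cs: and ds:, since the roles should already end up Primary/Primary via become-primary-on both.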
Thanks,
Colin

    global { usage-count yes; }

    common { protocol C; }

    resource r0 {
        syncer {
            verify-alg md5;
            rate 70M;
        }
        startup {
            become-primary-on both;
        }
        on edi1tcn1 {
            device    /dev/drbd1;
            disk      /dev/sda3;
            address   192.168.9.61:7789;
            meta-disk internal;
        }
        on edi1tcn2 {
            device    /dev/drbd1;
            disk      /dev/sda3;
            address   192.168.9.62:7789;
            meta-disk internal;
        }
        net {
            allow-two-primaries;
            after-sb-0pri discard-zero-changes;
            after-sb-1pri discard-secondary;
            after-sb-2pri disconnect;
        }
    }
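PS, relating to questions 2 and 4: for what it's worth, my current understanding is that if the after-sb policies can't resolve a split brain automatically, recovery is manual, roughly along these lines (this assumes the DRBD 8 drbdadm syntax and that GFS has already been unmounted on the node whose changes are to be thrown away; please correct me if this is wrong for dual primary):

    # on the node whose data we are discarding (GFS already unmounted)
    drbdadm secondary r0
    drbdadm -- --discard-my-data connect r0

    # on the surviving node, if it has dropped to StandAlone
    drbdadm connect r0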