Thanks for the input. You're right that 2 days is too little time to do this, so I'm going the manual route of shutting one server down at a time, migrating the virtual disks, then bringing it back up at the remote site.

To avoid more downtime from manual migration once this is all over, I think I will first attempt just getting a DRBD resource up and running to sync my servers back to the primary datacenter. Can a DRBD resource be set up on an existing LVM volume without affecting the data? Also, since I don't plan to have automatic failover, are there any precautions I should take if the network connection is lost between the two datacenters? Ideally this would allow me to have minimal downtime while the nodes re-sync.

You're correct, I really don't make a habit of this. Unfortunately steam + fiber bundles = big mess. The main junction for my campus, which supplies connectivity between about 2/3 of the buildings, is now without protection and the glass is exposed. Even with 12 fiber splicers and multiple mobile clean rooms, this repair will take 4-5 days, I'm told. Hopefully this event will allow my organization's executives to see how critical automatic failover and off-site replication can be.

Thanks again for the input.

- Trey

On Wed, Dec 14, 2011 at 3:44 AM, Felix Frank <ff at mpexnet.de> wrote:

> Hi,
>
> for your basic questions: Yes, your design idea is sound and it should
> work without any major problems, see exceptions below.
>
> Getting this in production in 2 days' time without any hands-on Pacemaker
> experience, though - that's one hell of a call. (I'm assuming this isn't
> something you've yet made a habit of.)
>
> I suggest you focus on the DRBD side of things and see if you can
> establish a synced resource. From your description of the situation, a
> manual failover near the start of work hours will still be much
> preferable to a week of downtime, so it may suffice?
>
> As for the aforementioned exceptions: During network failure, you will
> most definitely run into split-brain, i.e. the VMs in your datacenter
> remain operational and do whatever work they were doing when
> connectivity failed. The failover VMs will boot as though they had
> crashed at the moment of network failure. So once connectivity is
> restored, DRBD will tell you that in the datacenter, stuff has been
> written that your failover VMs never knew about.
>
> Normally (if you had time and resources), you would implement STONITH
> (a.k.a. fencing) to protect yourself by killing the original VMs during
> failover. As your time frame probably won't allow you to set this up
> properly ("testing? heh"), you may want to settle for the manual
> approach: Anticipate the split brain and be ready to discard whatever
> the original VMs saw fit to commit to their disks after getting
> disconnected.
>
> HTH,
> Felix
>
> On 12/13/2011 10:50 PM, Trey Dockendorf wrote:
> > I have somewhat of an emergency on my hands, and am hoping the community
> > may have some insight. One of the primary fiber rings on my campus will
> > be down for a week, unless the damaged fiber fails, it will be down
> > then. During this time my primary datacenter could possibly
> > have intermittent connectivity to other buildings / outside work.
> > Unfortunately we do not have the resources for a remote data center
> > (yet), but for now I'm setting up a remote KVM server in a portion of
> > campus that will not be affected. Currently all VMs are QCOW2 images
> > that live on a 1.2TB logical volume. This seems like the perfect
> > situation to use DRBD in active-passive. However, I'm not currently
> > trying to prepare for hardware failure but network failure. Is it
> > possible to do this, to have the two LVMs synced with DRBD and then have
> > a resource manager (Pacemaker) detect that the two can't communicate
> > (network failure) and then activate the passive node? I want to avoid
> > as much complexity as possible, so no live migration or dual primary if
> > possible.
> >
> > This is really a temporary solution for 1 week for all
> > my organization's web servers, but it has made our executives aware of
> > our need for more funds and a remote datacenter. So hopefully this
> > won't be the last time I can use DRBD, but I have to do this in the
> > next 2 days.
> >
> > Any help is greatly appreciated.
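
For reference, the manual approach Felix describes (anticipate the split brain, then discard what one side wrote) might look roughly like the sketch below. It is only an illustration under stated assumptions: the resource name "vmstore" and the helper function names are hypothetical and not from this thread, and nothing is executed; the functions just build the drbdadm command lines so they can be reviewed first. The option placement differs between DRBD 8.3 and 8.4, so check the user guide for the installed version, and stop any VMs using the device on the discarded side before demoting it.

```python
"""Sketch: drbdadm steps for manually resolving a split brain.

Nothing is executed here; the functions only build command lines for review.
The resource name "vmstore" is hypothetical.
"""


def victim_commands(resource, drbd_8_3=False):
    """Commands for the node whose post-split changes will be thrown away."""
    if drbd_8_3:
        # DRBD 8.3 passes the option through to drbdsetup after "--".
        reconnect = ["drbdadm", "--", "--discard-my-data", "connect", resource]
    else:
        # DRBD 8.4 accepts the option directly on "connect".
        reconnect = ["drbdadm", "connect", "--discard-my-data", resource]
    return [
        # The discarded side has to be Secondary before it can resync.
        ["drbdadm", "secondary", resource],
        reconnect,
    ]


def survivor_commands(resource):
    """Command for the node whose data is kept (needed if it is StandAlone)."""
    return [["drbdadm", "connect", resource]]


if __name__ == "__main__":
    for cmd in victim_commands("vmstore") + survivor_commands("vmstore"):
        print(" ".join(cmd))
```

Which side counts as the "victim" is a decision for the admin: in the scenario above it would be whichever set of VMs wrote data you are prepared to lose.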