[DRBD-user] DRBD+Heartbeat++LVM+Xen

José E. Colón Rodríguez jecolon at oss.cayey.upr.edu
Sat Jan 13 20:14:42 CET 2007

Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.


Hello all:

This message isn't a HOWTO; it's more of a heads-up that will hopefully
help others get a grip on the things you can do with this awesome
technology.  On that note, I must first congratulate and express my
admiration for the folks developing DRBD; this is definitely the type
of technology that enables a paradigm shift in how servers and storage
are managed in many diverse scenarios.  It's also the key to achieving
"The Poor Man's High Availability Cluster".

I dabbled a bit with 8.0pre6 and was able to get an active/active setup
that permitted live migration of Xen DomUs, but I had problems with
heartbeat and the Xen init scripts not being ready for this type of
setup.  If I rebooted a node, I was the victim of "split-brain hell",
requiring some sort of manual intervention.  I'm sure that with more
thorough tweaking and planning of this setup the problems could be
ironed out, but I just didn't have the time.

So I went back to the trusty old 0.7.22 and have a setup that's truly
HA and automated.  It's a great feeling to walk up to one node, just
press the power button, and in a matter of seconds have everything
fail over smoothly to the other box.  Better yet, when the shut-down
node comes back up, the resources are returned automatically.  No
human intervention required whatsoever.

So here's a quick glance at what I've got running.  If anyone needs
more details, let me know.

* Two-node HA cluster composed of two dual AMD Opteron based 1U
rack-mount boxes.  Identical configurations on both.

* One Gigabit NIC to the network and another Gigabit NIC for the
crossover link directly between the two machines.

* CentOS 4.4 x86_64 on both nodes.  Minimal install: de-selected  
everything in the installer.

* Compiled Xen (lean Dom0 and DomU kernels)
* Compiled DRBD 0.7.22
* Heartbeat via yum
* Built-in LVM2 courtesy of CentOS

1.  Used fdisk to set up two partitions on a 146 GB drive.  I set the
type to Linux LVM (8e).
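
Something along these lines (I'm assuming the 146 GB drive is
/dev/sda; the split between the two partitions is up to you):

   fdisk /dev/sda
     n    # new partition 1 -> /dev/sda1 (~73 GB)
     n    # new partition 2 -> /dev/sda2 (~73 GB)
     t    # set each partition's type to 8e (Linux LVM)
     w    # write the partition table and exit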

2.  Set up drbd.conf to use one partition for r0 and the other for r1.
These are drbd0 and drbd1, respectively.
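
A rough sketch of what the r0 section can look like with DRBD 0.7
(the hostnames, IPs, and port are placeholders for my actual values;
I'm assuming internal meta-data here):

   resource r0 {
     protocol C;
     on node1 {
       device    /dev/drbd0;
       disk      /dev/sda1;
       address   10.0.0.1:7788;   # crossover link IP of node1
       meta-disk internal;
     }
     on node2 {
       device    /dev/drbd0;
       disk      /dev/sda1;
       address   10.0.0.2:7788;   # crossover link IP of node2
       meta-disk internal;
     }
   }

r1 follows the same pattern with /dev/drbd1 on /dev/sda2 and a
different port.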

3.  Set up /etc/lvm/lvm.conf to filter out the underlying partitions
and add the drbd devices explicitly, e.g.:
   filter = [ "r|/dev/sda1|", "a|/dev/drbd0|", "r|/dev/sda2|", "a|/dev/drbd1|" ]

4.  Created an LVM PV on each drbd device.

5.  Created an LVM VG on each PV.

6.  Created two LVM LVs for each DomU: one for swap and the other for
its VBD.  I distributed these LVs between the two VGs, which sit atop
the two separate drbd devices.
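
For steps 4 through 6 the commands boil down to something like this,
run on the node that's primary for r0 (the VG names match the
haresources example in step 13; the LV names and sizes are made up):

   drbdadm primary r0
   pvcreate /dev/drbd0
   vgcreate vgr0 /dev/drbd0
   lvcreate -L 512M -n vm01swap vgr0   # swap for the first DomU
   lvcreate -L 8G -n vm01vbd vgr0      # VBD for the first DomU

Then the same for r1 (/dev/drbd1, vgr1) on the node that's primary
for it.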

7.  Created two subdirectories under /etc/xen/auto: one for the DomUs  
to be active on node1 and the other for the DomUs to be active on node2.

8.  Distributed the config files for the DomUs between the two new
subdirectories.
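
In other words (the subdirectory and DomU config names are just
examples):

   mkdir /etc/xen/auto/node1 /etc/xen/auto/node2
   mv /etc/xen/vm01 /etc/xen/auto/node1/
   mv /etc/xen/vm02 /etc/xen/auto/node2/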

9.  Copied /etc/sysconfig/xendomains twice (xd1 and xd2) in that
directory, since we need two configs for the two new subdirectories of
/etc/xen/auto.  Modified these two new copies, setting the
XENDOMAINS_AUTO variable to one of the two new subdirs respectively.
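
Roughly (subdirectory names as in the previous steps):

   cp /etc/sysconfig/xendomains /etc/sysconfig/xd1
   cp /etc/sysconfig/xendomains /etc/sysconfig/xd2
   # then in /etc/sysconfig/xd1: XENDOMAINS_AUTO=/etc/xen/auto/node1
   # and in /etc/sysconfig/xd2:  XENDOMAINS_AUTO=/etc/xen/auto/node2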

10.  Likewise copied /etc/rc.d/init.d/xendomains twice, this time not
into the same directory but into /etc/ha.d/resource.d, since only
heartbeat will be using these scripts.  Modified each copy to point to
the corresponding config file we created in the previous step.
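
Roughly:

   cp /etc/rc.d/init.d/xendomains /etc/ha.d/resource.d/xd1
   cp /etc/rc.d/init.d/xendomains /etc/ha.d/resource.d/xd2
   # in each copy, change whatever line sources the sysconfig file,
   # e.g. in xd1: . /etc/sysconfig/xendomains -> . /etc/sysconfig/xd1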

11.  Configured the /etc/ha.d/ha.cf file to enable auto_failback and
specify the cluster node names, the ucast device, and the peer IP.
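
A minimal sketch of the relevant ha.cf lines (the timings are just
examples; eth1 is assumed to be the crossover NIC, and each node's
ucast line points at the other node's crossover IP):

   keepalive 1
   deadtime 10
   auto_failback on
   ucast eth1 10.0.0.2
   node node1
   node node2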

12.  Modified the /etc/ha.d/resource.d/LVM script so it would work on
CentOS.  For some reason it was not able to detect the LVM version and
balked at the command-line switches to the LVM commands.

13.  Configured /etc/ha.d/haresources to specify one resource line
for each group of DomUs with the corresponding drbddisk and LVM
dependencies, e.g.:
   node1 drbddisk::r0 LVM::vgr0 xd1
   node2 drbddisk::r1 LVM::vgr1 xd2
The first field names the node that normally owns the resource group;
with auto_failback on, heartbeat returns the group there when that
node comes back.

So what does all this produce?  Node 1 has N DomUs and so does Node 2.
Each set of DomUs is on its own drbd device, and each node is primary
for one of these devices.  When a node fails, heartbeat sets the other
node as primary for the affected drbd device, activates the LVM VG and
LVs, and starts the affected set of DomUs via their custom xendomains
script (xd1 or xd2).  It works great.  I've rebooted, pulled the plug,
and hit the power button, and everything fails over OK.  There's a
slight delay of about 90 seconds, since this is failover rather than
live migration, but my environment can tolerate that.

Try it out, it's all good!
-- 
José E. Colón Rodríguez
Academic Computing Coordinator
University of Puerto Rico at Cayey
V.  787-738-2161 x. 2415, 2532
E.  jecolon at cayey.upr.edu




