[DRBD-user] drdb for hundreds of virtual servers with cheap hardware

Graham Wood drbd at spam.dragonhold.org
Tue Oct 16 15:40:13 CEST 2007

Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.



----- Message from adrien at modulis.ca ---------
> -will NFS do the job? (knowing that there will not be simultaneous access to
> the same data from different virtual servers). Note: the virtual servers will
> be relatively small, 300MB each; they will be stored in a folder, not an image
> (kind of like a chroot).
What virtualization method are you using?  All the ones that really  
separate things out (other than Zones) that I've used require a block  
device, not a directory....

NFS is able to do it, but your commodity hardware may not be able to  
handle the throughput you're looking for using NFS - a lot depends on  
the traffic patterns, load, etc.  Your network diagram also shows each  
storage server as only having a single network link to the backbone -  
which is probably not what you want.  I'd suggest a couple of changes:

1. Use 2 "partitions" within the RAID1, and have both servers active  
(nfs-virt-1 and nfs-virt-2 if you like) to maximize performance (only  
in failover conditions will you have everything running off a single  
server.

2. Use four network connections for the storage servers - a pair for the  
DRBD link and a pair for the front-end connection. That removes the  
two single points of failure (SPoFs) that your diagram has there.

3. If you can afford it, I'd use four network connections for the  
virtualization servers too: a pair to the back-end storage and another  
pair to the front end for user traffic.
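For suggestion 1, a two-resource layout might look roughly like the  
following drbd.conf sketch (DRBD 8.x syntax). The hostnames, disks  
and addresses are made up for illustration - adjust to your setup:

```
# Hedged sketch - nfs-virt-1/2, /dev/sda3-4 and the IPs are assumptions.
resource r0 {
    protocol C;
    on nfs-virt-1 {
        device    /dev/drbd0;
        disk      /dev/sda3;
        address   192.168.10.1:7788;
        meta-disk internal;
    }
    on nfs-virt-2 {
        device    /dev/drbd0;
        disk      /dev/sda3;
        address   192.168.10.2:7788;
        meta-disk internal;
    }
}
resource r1 {
    protocol C;
    on nfs-virt-1 {
        device    /dev/drbd1;
        disk      /dev/sda4;
        address   192.168.10.1:7789;
        meta-disk internal;
    }
    on nfs-virt-2 {
        device    /dev/drbd1;
        disk      /dev/sda4;
        address   192.168.10.2:7789;
        meta-disk internal;
    }
}
```

In normal operation you'd run r0 primary on nfs-virt-1 and r1 primary  
on nfs-virt-2; on failover the survivor promotes both.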

> -will Heartbeat "guarantee" that failover is made transparently, without
> human intervention?
I use the redhat cluster suite rather than heartbeat, and NFS running  
on that does this quite happily for me.  I'm using DRBD as the back  
end for a shared LVM arrangement - this provides my storage for a DB,  
user home directories, mail server, etc.  I'm using the RH cluster  
rather than heartbeat because it seems to have better options for  
managing the services (e.g. most of the time I have four services  
running on serverA and one on serverB - my VoIP service normally gets  
its own server).
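In heartbeat v1 terms - which I'm not running myself, so treat this as  
an untested sketch - the equivalent would be a haresources entry along  
these lines (resource name, device, mount point and init script name  
are all assumptions):

```
# /etc/ha.d/haresources - one line per service group, read left to right
serverA drbddisk::r0 Filesystem::/dev/drbd0::/export::ext3 nfs-kernel-server
```

That promotes the DRBD resource, mounts it, then starts NFS, and tears  
things down in reverse order on failover.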

> -My physical virtualization servers will be diskless to simplify management
> (only swap will be on a local partition). Is that a bad idea - could it
> decrease performance?
How much of an effect this has depends on the virtualization method as  
much as anything else.  If the OS is primarily "passive", and  
therefore not accessed much, then this should be fine - although if  
you're using local swap anyway, it's almost as easy to put a really  
simple OS image on that disk, which could reduce your network  
traffic.  Most Linux distributions allow very easy/quick provisioning,  
so you could even skip RAID on those servers.   
I'm using FAI to do my debian installs, and I can reinstall a node in  
my cluster in approximately 4 minutes - not counting the DRBD resync.

> -Can DRBD save me the RAID 1 setup, so that I can use RAID 0 and double my
> capacity without affecting NFS service in case of hard disk failure?
Yes and no.  You have 2 copies of the data, so the system could cope  
with a failure - but you then have no extra redundancy at all.   
Considering the time it'll take to rebuild a 750GB DRBD device (and  
the associated performance reduction), I think that the $100 or so  
saving per disk just wouldn't be worth it.
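To put a rough number on that rebuild time - the 30 MB/s sync rate  
below is purely an assumption (it depends on the configured syncer  
rate and on what the disks and network can sustain alongside live  
traffic):

```python
# Back-of-the-envelope estimate of a full DRBD resync of one disk.
disk_gb = 750          # size of the device being resynced
sync_rate_mb_s = 30    # assumed throttled sync rate, MB/s

seconds = disk_gb * 1024 / sync_rate_mb_s
hours = seconds / 3600.0
print("%.1f hours" % hours)  # roughly 7 hours at 30 MB/s
```

So you'd be running degraded, with reduced performance, for most of a  
working day per failure.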

> -Has anyone run software RAID 5 and DRBD, or is the overhead too significant?
I've done it, but personally I really don't like RAID5 anyway...   
Considering the relative costs of disks and servers I'd probably stick  
with mirroring.  50 virtual machines per physical server only works  
out (using your 300MB figure) to about 15GB - so each pair of 750GB  
disks could in principle hold roughly 50 full virtualization servers'  
worth of data... The performance of that (potentially thousands of  
virtual machines fighting to access a single spindle) is something  
you'll need to check; it could easily blow up if the load is high  
enough.

> -Another scenario would be to use the local disk (80GB) of each
> virtualization server (no more PXE or NFS) and have DRBD duplicate the
> local disk to a same-size partition on a single RAID 5 NAS server. Do you
> think this second scenario would be better in terms of uptime?
It is likely to be a much higher performance answer than using NFS as  
the back end, but you need to think about failover to decide whether  
it is better.  If you are using the single pair of "storage" servers,  
then when one of your virtualization servers dies you can redistribute  
its virtual servers across the rest of the estate however you want.   
If you mirror the local disk to a single back end, how do you bring  
that virtual machine up somewhere else?

A more "generic" solution would be to totally decentralize the  
solution - and automatically allocate virtual servers a primary and  
secondary location.  Autogenerate the DRBD configuration files from  
this pairing, and you can split the load more evenly.  That gives you  
a situation where losing a single virtualization server just slightly  
increases the load everywhere without much additional effort - and  
would remove the bottleneck of a small number of storage servers (e.g.  
all virtualization servers talking to each other, rather than all of  
them talking to 1 or 2 backend stores).
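A minimal sketch of that allocation step (the hostnames and the  
round-robin policy are my own illustration - a real allocator would  
also spread each host's mirrors across many peers to limit correlated  
loss when two hosts die together):

```python
def assign_pairs(vms, hosts):
    """Give each virtual server a primary and a secondary host,
    round-robin, so load stays even; returns {vm: (primary, secondary)}."""
    n = len(hosts)
    pairs = {}
    for i, vm in enumerate(vms):
        primary = hosts[i % n]
        secondary = hosts[(i + 1) % n]  # mirror lives on the neighbouring host
        pairs[vm] = (primary, secondary)
    return pairs

hosts = ["phys%02d" % i for i in range(1, 11)]   # 10 physical servers
vms = ["vm%02d" % i for i in range(1, 91)]       # 90 virtual servers
pairs = assign_pairs(vms, hosts)

# Each pairing could then be turned into a generated DRBD resource
# stanza naming the two hosts; every host ends up primary for 9 VMs
# and secondary for another 9.
print(pairs["vm01"])
```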

Using a very quick and simple model, I reckon that with 10 physical  
servers, each with 9 virtual servers, this would give you:
0 failures = 90 available, all servers running 9 virtual machines
1 failure  = 90 available, all servers running 10 virtual machines
2 failures = 89 available, 7 servers running 11, 1 server running 12

Graham


