Hi Graham,<br><br>Thanks for your time; my answers are below:<br><br><div><span class="gmail_quote">On 10/16/07, <b class="gmail_sendername">Graham Wood</b> <<a href="mailto:drbd@spam.dragonhold.org">drbd@spam.dragonhold.org
</a>> wrote:</span><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"><br><br>----- Message from <a href="mailto:adrien@modulis.ca">adrien@modulis.ca
</a> ---------<br>> -will NFS do the job? (knowing that there will not be simultaneous access of<br>> the same data from different virtual servers). Note: the virtual servers will<br>> be relatively small, 300MB each; they will be stored in a folder, not an image
<br>> (kinda like chroot).<br>What virtualization method are you using? All the ones that really<br>separate things out (other than Zones) that I've used require a block<br>device, not a directory....</blockquote>
<div><br><br>I'm using vserver, which is closer to a FreeBSD jail/chroot than to Xen or VMware:<br>vserver basically "chroots" all the processes of the guest, and all the virtual servers share the same kernel.<br></div><br>
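<div>Since each guest is really just a directory tree, it is easy to check how much space each one actually uses (and whether the 300MB figure holds over time). A rough sketch, assuming one directory per guest under /vservers, which is just my layout and not something vserver requires:<br><pre>
#!/usr/bin/env python
# Rough sketch: report how much disk each vserver guest directory actually uses.
# Assumes one directory per guest under /vservers - adjust VSERVER_ROOT if needed.
import os

VSERVER_ROOT = "/vservers"

def dir_size(path):
    """Walk a directory tree and add up file sizes, in bytes."""
    total = 0
    for root, dirs, files in os.walk(path):
        for name in files:
            try:
                total += os.lstat(os.path.join(root, name)).st_size
            except OSError:
                pass  # file vanished or is unreadable - skip it
    return total

if __name__ == "__main__":
    grand_total = 0
    for guest in sorted(os.listdir(VSERVER_ROOT)):
        path = os.path.join(VSERVER_ROOT, guest)
        if not os.path.isdir(path):
            continue
        size = dir_size(path)
        grand_total += size
        print "%-20s %8.0f MB" % (guest, size / 1024.0 / 1024.0)
    print "%-20s %8.0f MB" % ("total", grand_total / 1024.0 / 1024.0)
</pre></div><br>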
<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">NFS is able to do it, but your commodity hardware may not be able to<br>handle the throughput you're looking for using NFS - a lot depends on
<br>the traffic patterns, load, etc. Your network diagram also shows each<br>storage server as only having a single network link to the backbone -<br>which is probably not what you want. I'd suggest a couple of changes:
<br><br>1. Use 2 "partitions" within the RAID1, and have both servers active<br>(nfs-virt-1 and nfs-virt-2 if you like) to maximize performance (only<br>in failover conditions will you have everything running off a single
<br>server).</blockquote><div><br>That's a good idea!<br></div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">2. Use 4 network connections for the storage servers - a pair for the
<br>DRBD link and a pair for the front end connection. It removes the 2<br>SPoFs (Single Points of Failure) that your diagram has there.</blockquote><div><br>I can do that too.<br></div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
3. If you can afford it, I'd use 4 network connections for the<br>virtualization servers too. A pair to the backend storage and another<br>pair to the front end for user usage.<br><br>> -will Heartbeat "guarantee" that failover is made transparently without
<br>> human intervention?<br>I use the redhat cluster suite rather than heartbeat, and NFS running<br>on that does this quite happily for me. I'm using DRBD as the back<br>end for a shared LVM arrangement - this provides my storage for a DB,
<br>user home directories, mail server, etc. I'm using the RH cluster<br>rather than heartbeat because it seems to have better options for<br>managing the services (e.g. most of the time I have 4 running on<br>serverA and 1 running on serverB - my VoIP stuff gets its own server
<br>normally)</blockquote><div><br><br>I will have a look at the Red Hat cluster suite, but it seems more complicated to set up than heartbeat.<br></div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
> -My physical virtualization servers will be diskless to simplify management<br>> (only swap will be on a local partition), is it a bad idea - could it<br>> decrease performance?<br>How much of an effect this would have depends on the method of doing
<br>the virtualization as much as anything else. If the OS is primarily<br>"passive", and therefore not accessed much, then this should be fine -<br>although if you're using local swap then it's almost as easy to have a
<br>really simple OS image on it - which could reduce your network<br>traffic. Most linux distributions allow for very easy/quick<br>provisioning, so you could even not bother with RAID on the servers.<br>I'm using FAI to do my debian installs, and I can reinstall a node in
<br>my cluster in approximately 4 minutes - not counting the DRBD resync.</blockquote><div><br>The goal of the PXE boot is to save me a KVM if I screw up a boot process, but I can easily do an automated network install over PXE instead.
<br> </div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">> -Can DRBD save me the RAID 1 setup, so that I can use RAID 0 and double my<br>> capacity without affecting NFS service in case of hard disk failure?<br>Yes and no. You have 2 copies of the data, so the system could cope<br>with a failure - but you then have no extra redundancy at all.<br>Considering the time it'll take to rebuild a 750GB DRBD device (and
<br>the associated performance reduction), I think that the $100 or so<br>saving per disk just wouldn't be worth it.</blockquote><div><br><br>Good point, I haven't accounted for the DRDB rebuilding time ... <br></div>
<br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">> -Has anyone run software RAID 5 and DRBD, or is the overhead too significant?<br>I've done it, but personally really don't like RAID5 anyway...
<br>Considering the relative costs of disks and servers I'd probably stick<br>with mirroring. 50 virtual machines per physical only works out<br>(using your 300MB figure) as 150GB - so each pair of 750GB disks can<br>
handle approximately 5 full virtualization servers... The performance<br>of this (250 different machines fighting to access a single spindle)<br>is something that you'll need to check; it could easily blow up if the<br>
load is high enough</blockquote><div><br>The VoIP servers don't do too much IO (they are more CPU-intensive).<br></div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
> -Another scenario would be to use the local disk (80GB) of each<br>> virtualization server (no more PXE or NFS) and have DRBD duplicate the<br>> local disk to a same-size partition on a single RAID 5 NAS server. Do you think<br>> this second scenario would be better in terms of uptime?<br>It is likely to be a much higher performance answer than using NFS as<br>the back end, but you need to think about failover to decide whether<br>it is better. If you are using the single pair of "storage" servers,
<br>then if one of your virtualization servers dies you can evenly<br>distribute the virtualized servers across the rest of the estate to<br>wherever you want. If you mirror the local disk to a single back end,<br>how do you bring up that virtual machine somewhere else?
<br><br>A more "generic" solution would be to totally decentralize the<br>solution - and automatically allocate virtual servers a primary and<br>secondary location. Autogenerate the DRBD configuration files from
<br>this pairing, and you can split the load more evenly. That gives you<br>a situation where losing a single virtualization server just slightly<br>increases the load everywhere without much additional effort - and<br>would remove the bottleneck of a small number of storage servers (
e.g.<br>all virtualization servers talking to each other, rather than all of<br>them talking to 1 or 2 backend stores).</blockquote><div><br><br>I thought about this first, but I think it would mean a lot more configuration and scripting.<br>The other problem with this setup is that most of the servers I can buy have SCSI interfaces only, so I can't easily increase the disk capacity.<br></div><br>
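<div>That said, the scripting part might not be too bad. A rough sketch of what I have in mind, generating one DRBD resource per guest from a primary/secondary pairing list; the hostnames, volume paths and ports below are made-up placeholders, and the exact options would need checking against the DRBD documentation:<br><pre>
# Sketch: emit one DRBD resource stanza per virtual server from a
# (guest, primary host, secondary host) pairing list.
# Hostnames, IPs, devices and ports are placeholders, not a real config.

PAIRINGS = [
    # (guest,    primary,  secondary)
    ("voip-01", "virt-1", "virt-2"),
    ("voip-02", "virt-2", "virt-3"),
    ("voip-03", "virt-3", "virt-1"),
]

HOST_IPS = {"virt-1": "10.0.0.1", "virt-2": "10.0.0.2", "virt-3": "10.0.0.3"}
BASE_PORT = 7788

def resource(index, guest, primary, secondary):
    """Build the drbd.conf stanza pairing this guest's two hosts."""
    lines = ["resource %s {" % guest,
             "  protocol C;",
             "  syncer { rate 30M; }"]
    for host in (primary, secondary):
        lines += ["  on %s {" % host,
                  "    device    /dev/drbd%d;" % index,
                  "    disk      /dev/vg0/%s;" % guest,
                  "    address   %s:%d;" % (HOST_IPS[host], BASE_PORT + index),
                  "    meta-disk internal;",
                  "  }"]
    lines.append("}")
    return "\n".join(lines)

if __name__ == "__main__":
    for i, (guest, pri, sec) in enumerate(PAIRINGS):
        print resource(i, guest, pri, sec)
        print
</pre>Each pair of hosts would only carry the guests it is primary or secondary for, which is what spreads the load the way you describe.</div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">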
Using a very quick and simple model, I reckon that with 10 physical<br>servers, each with 9 virtual servers, this would give you:<br>0 failures = 90 available, all servers running 9 virtual machines<br>1 failure = 90 available, all servers running 10 virtual machines
<br>2 failures = 89 available, 7 servers running 11, 1 server running 12<br><br>Graham</blockquote><div><br><br>Thanks a lot for your feedback,<br><br>Adrien<br></div><br>
</div><br><br clear="all"><br>-- <br>Adrien Laurent<br>(514) 284-2020 x 202<br><a href="http://www.modulis-groupe.com">http://www.modulis-groupe.com</a>