Hi Graham,<br><br>Thanks for your time; my answers are below:<br><br><div><span class="gmail_quote">On 10/16/07, <b class="gmail_sendername">Graham Wood</b> <<a href="mailto:drbd@spam.dragonhold.org">drbd@spam.dragonhold.org
</a>> wrote:</span><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"><br><br>----- Message from <a href="mailto:adrien@modulis.ca">adrien@modulis.ca
</a> ---------<br>> -will NFS do the job? (knowing that there will not be simultaneous access of<br>> the same data from different virtual servers). Note: the virtual servers will<br>> be relatively small, 300MB each; they will be stored in a folder, not an image
<br>> (kinda like chroot).<br>What virtualization method are you using? All the ones that really<br>separate things out (other than Zones) that I've used require a block<br>device, not a directory....</blockquote>
<div><br><br>I'm using vserver, which is closer to a FreeBSD jail/chroot than to Xen or VMware:<br>vserver basically "chroots" all the processes of the guest, and all the virtual servers share the same kernel.<br></div><br>
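<div>Since each guest is really just a directory tree, it is easy to check how much space each one actually uses (and whether the 300MB figure holds over time). A rough sketch, assuming one directory per guest under /vservers, which is just my layout and not something vserver requires:<br><pre>
#!/usr/bin/env python
# Rough sketch: report how much disk each vserver guest directory actually uses.
# Assumes one directory per guest under /vservers - adjust VSERVER_ROOT if needed.
import os

VSERVER_ROOT = "/vservers"

def dir_size(path):
    """Walk a directory tree and add up file sizes, in bytes."""
    total = 0
    for root, dirs, files in os.walk(path):
        for name in files:
            try:
                total += os.lstat(os.path.join(root, name)).st_size
            except OSError:
                pass  # file vanished or is unreadable - skip it
    return total

if __name__ == "__main__":
    grand_total = 0
    for guest in sorted(os.listdir(VSERVER_ROOT)):
        path = os.path.join(VSERVER_ROOT, guest)
        if not os.path.isdir(path):
            continue
        size = dir_size(path)
        grand_total += size
        print "%-20s %8.0f MB" % (guest, size / 1024.0 / 1024.0)
    print "%-20s %8.0f MB" % ("total", grand_total / 1024.0 / 1024.0)
</pre></div><br>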
<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">NFS is able to do it, but your commodity hardware may not be able to<br>handle the throughput you're looking for using NFS - a lot depends on
<br>the traffic patterns, load, etc. Your network diagram also shows each<br>storage server as only having a single network link to the backbone -<br>which is probably not what you want. I'd suggest a couple of changes:
<br><br>1. Use 2 "partitions" within the RAID1, and have both servers active<br>(nfs-virt-1 and nfs-virt-2 if you like) to maximize performance (only<br>in failover conditions will you have everything running off a single
<br>server).</blockquote><div><br>That's a good idea!<br></div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">2. Use 4 network connections for the storage servers - a pair for the
<br>DRBD link and a pair for the front end connection. It removes the 2<br>SPoFs (Single Points of Failure) that your diagram has there.</blockquote><div><br>I can do that too.<br></div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
3. If you can afford it, I'd use 4 network connections for the<br>virtualization servers too. A pair to the backend storage and another<br>pair to the front end for user usage.<br><br>> -will Heartbeat "guarantee" that failover is made transparently without
<br>> human intervention?<br>I use the redhat cluster suite rather than heartbeat, and NFS running<br>on that does this quite happily for me. I'm using DRBD as the back<br>end for a shared LVM arrangement - this provides my storage for a DB,
<br>user home directories, mail server, etc. I'm using the RH cluster<br>rather than heartbeat because it seems to have better options for<br>managing the services (e.g. most of the time I have 4 running on<br>serverA and 1 running on serverB - my VoIP stuff gets its own server
<br>normally)</blockquote><div><br><br>I will have a look at the Red Hat cluster suite, but it seems more complicated to set up than heartbeat.<br></div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
> -My physical virtualization servers will be diskless to simplify management<br>> (only swap will be on a local partition), is it a bad idea - could it<br>> decrease performance?<br>How much of an effect this would have depends on the method of doing
<br>the virtualization as much as anything else. If the OS is primarily<br>"passive", and therefore not accessed much, then this should be fine -<br>although if you're using local swap then it's almost as easy to have a
<br>really simple OS image on it - which could reduce your network<br>traffic. Most linux distributions allow for very easy/quick<br>provisioning, so you could even not bother with RAID on the servers.<br>I'm using FAI to do my debian installs, and I can reinstall a node in
<br>my cluster in approximately 4 minutes - not counting the DRBD resync.</blockquote><div><br>The goal of the PXE boot is to save me a KVM if I screw up a boot process, but I can easily do an automated network install over PXE instead.
<br> </div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">> -Can DRBD save me the RAID 1 setup, so that I can use RAID 0 and double my<br>> capacity without affecting NFS service in case of hard disk failure?<br>Yes and no. You have 2 copies of the data, so the system could cope<br>with a failure - but you then have no extra redundancy at all.<br>Considering the time it'll take to rebuild a 750GB DRBD device (and
<br>the associated performance reduction), I think that the $100 or so<br>saving per disk just wouldn't be worth it.</blockquote><div><br><br>Good point, I haven't accounted for the DRDB rebuilding time ... <br></div>
<br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">> -Has anyone run software RAID 5 and DRBD, or is the overhead too significant?<br>I've done it, but personally really don't like RAID5 anyway...
<br>Considering the relative costs of disks and servers I'd probably stick<br>with mirroring. 50 virtual machines per physical only works out<br>(using your 300MB figure) as 150GB - so each pair of 750GB disks can<br>
handle approximately 5 full virtualization servers... The performance<br>of this (250 different machines fighting to access a single spindle)<br>is something that you'll need to check; it could easily blow up if the<br>
load is high enough</blockquote><div><br>The VoIP servers don't do too much IO (they are more CPU-intensive).<br></div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
> -Another scenario would be to use the local disk (80GB) of each<br>> virtualization server (no more PXE or NFS) and have DRBD duplicate the<br>> local disk to a same-size partition on a single RAID 5 NAS server. Do you think<br>> this second scenario would be better in terms of uptime?<br>It is likely to be a much higher performance answer than using NFS as<br>the back end, but you need to think about failover to decide whether<br>it is better. If you are using the single pair of "storage" servers,
<br>then if one of your virtualization servers dies you can evenly<br>distribute the virtualized servers across the rest of the estate to<br>wherever you want. If you mirror the local disk to a single back end,<br>how do you bring up that virtual machine somewhere else?
<br><br>A more "generic" solution would be to totally decentralize the<br>solution - and automatically allocate virtual servers a primary and<br>secondary location. Autogenerate the DRBD configuration files from
<br>this pairing, and you can split the load more evenly. That gives you<br>a situation where losing a single virtualization server just slightly<br>increases the load everywhere without much additional effort - and<br>would remove the bottleneck of a small number of storage servers (
e.g.<br>all virtualization servers talking to each other, rather than all of<br>them talking to 1 or 2 backend stores).</blockquote><div><br><br>I thought about this first, but I think it would mean a lot more configuration and scripting.<br>The other problem with this setup is that most of the servers I can buy have SCSI interfaces only, so I can't easily increase the disk capacity.<br></div><br>
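<div>That said, the scripting part might not be too bad. A rough sketch of what I have in mind, generating one DRBD resource per guest from a primary/secondary pairing list; the hostnames, volume paths and ports below are made-up placeholders, and the exact options would need checking against the DRBD documentation:<br><pre>
# Sketch: emit one DRBD resource stanza per virtual server from a
# (guest, primary host, secondary host) pairing list.
# Hostnames, IPs, devices and ports are placeholders, not a real config.

PAIRINGS = [
    # (guest,    primary,  secondary)
    ("voip-01", "virt-1", "virt-2"),
    ("voip-02", "virt-2", "virt-3"),
    ("voip-03", "virt-3", "virt-1"),
]

HOST_IPS = {"virt-1": "10.0.0.1", "virt-2": "10.0.0.2", "virt-3": "10.0.0.3"}
BASE_PORT = 7788

def resource(index, guest, primary, secondary):
    """Build the drbd.conf stanza pairing this guest's two hosts."""
    lines = ["resource %s {" % guest,
             "  protocol C;",
             "  syncer { rate 30M; }"]
    for host in (primary, secondary):
        lines += ["  on %s {" % host,
                  "    device    /dev/drbd%d;" % index,
                  "    disk      /dev/vg0/%s;" % guest,
                  "    address   %s:%d;" % (HOST_IPS[host], BASE_PORT + index),
                  "    meta-disk internal;",
                  "  }"]
    lines.append("}")
    return "\n".join(lines)

if __name__ == "__main__":
    for i, (guest, pri, sec) in enumerate(PAIRINGS):
        print resource(i, guest, pri, sec)
        print
</pre>Each pair of hosts would only carry the guests it is primary or secondary for, which is what spreads the load the way you describe.</div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">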
Using a very quick and simple model, I reckon that with 10 physical<br>servers, each with 9 virtual servers, this would give you:<br>0 failures = 90 available, all servers running 9 virtual machines<br>1 failure = 90 available, all servers running 10 virtual machines
<br>2 failures = 89 available, 7 servers running 11, 1 server running 12<br><br>Graham</blockquote><div><br><br>Thanks a lot for your feedback,<br><br>Adrien<br></div><br>
</div><br><br clear="all"><br>-- <br>Adrien Laurent<br>(514) 284-2020 x 202<br><a href="http://www.modulis-groupe.com">http://www.modulis-groupe.com</a>