[DRBD-user] drbd8 and 80+ 1TB mirrors/cluster, can it be done?

Christian Balzer chibi at gol.com
Wed May 28 08:36:33 CEST 2008

Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.


On Tue, 27 May 2008 09:17:53 -0700 Tim Nufire wrote:
> 
> My business depends on being able to deliver lots of reliable storage
> at the lowest possible cost. In terms of capital costs, this means
> something like $0.50 per formatted GB. I looked at servers like the  
> one you mention above but couldn't make the numbers work :-/ Instead,  
> I'm using enclosures like:
> 
> http://www.addonics.com/products/raid_system/rack_overview.asp and
> http://www.addonics.com/products/raid_system/mst4.asp
> 
Those don't seem to wind up being all that much cheaper, given the
density (is your rack space "free"?) and the lack of hot-swap capability
(or at least the ability to swap a drive without dismantling the box).

At least I'm not suggesting that you get a "Thumper", though I'm sure
for some people that is the ideal and most economical solution.
(http://www.sun.com/servers/x64/x4500/)

I'm always aiming for the cheapest possible solution as well, but that
is always tempered by how reliable and serviceable the result needs to be.
  
> >> I am running Debian "Etch" 4.0r3 i386, drbd v8.0.12, Heartbeat  
> >> v2.0.7,
> >> LVM v2.02.07 and mdadm v2.5.6. I'm using MD to assign stable device
> >> names (RAID1 with just 1 disk, /dev/md0, /dev/md1, etc...), drbd8 to
> >> mirror the disks 1-1 to the second server and then lvm to create a  
> >> few
> >> logical volumes on top of the drbd8 mirrors.
> >>
> > You might want to check out Debian backports for more up to date  
> > packages
> > where it matters, like heartbeat.
> 
> This was suggested on the heartbeat mailing list as well.... What is  
> the most stable/recommended heartbeat release to use?
>
If you don't feel like rolling your own packages, the latest one from
backports.
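If you go the backports route it boils down to something like this (the
exact repository line is from memory of the backports.org layout of that
era, so treat it as a sketch):

  # /etc/apt/sources.list -- add the etch-backports repository
  deb http://www.backports.org/debian etch-backports main

  # backports are marked NotAutomatic, so request the package explicitly
  apt-get update
  apt-get -t etch-backports install heartbeat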
 
> > And you really want to use a resilient approach on the lowest level,  
> > in
> > your case RAID5 over the whole disks (with one spare at least given  
> > that
> > with 80 drives per node you are bound to have disk failures frequently
> > enough). In fact I'd question the need for having a DRBD mirror of
> > archive backups in the first place, but that is your call and money.  
> > ^^
> 
> Unfortunately, cost is driving most of my decisions and RAID5 adds  
> 10-20% to the total cost. 
Come again? I was suggesting an overhead of 2 drives, which comes to 2.5%
with 80 drives. Other than that RAID5 is free (md driver), and you certainly
were not holding back on CPU power in your specs (fewer but faster cores,
and most likely Opterons instead of Intels, would do better in this
scenario). Of course I have no idea how many SATA (SCSI, really) drives
current kernels can handle, or how many drives can be part of a single
software RAID.
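For the record, the md side of it is nothing more exotic than this (a
sketch with placeholder device names and a small 8-disk count; scale the
numbers up for your 80 drives):

  # 8 disks: 7 in the RAID5 array (6 data + 1 parity) plus 1 hot spare,
  # i.e. a total overhead of 2 drives
  mdadm --create /dev/md0 --level=5 \
        --raid-devices=7 --spare-devices=1 \
        /dev/sd[b-i]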

>I'm using DRBD in part because it both  
> replicates data and provides high-availability for the servers/ 
> services. I'll have some spare drives racked and powered so when  
> drives go bad I can just re-mirror to a good drive leaving the dead  
> device in the rack indefinitely.
>
Er, if a drive in your proposed setup dies, that volume becomes corrupted
and you will have to fail over to the other DRBD node and re-sync. One
does not design an HA system in a way that makes fail-overs (which cause a
service interruption, however brief) a normal occurrence. They should only
happen when every other possible layer of redundancy and resilience has
failed (or when you trigger them intentionally, for example for a rolling
kernel upgrade). And you had better pray very hard that the drive on the
other node doesn't pick the re-sync window to shuffle off its mortal coil,
too.
I'd strongly suggest something like this:
two 39-drive RAID5 arrays on each node, plus 2 spares in the same
spare-group so mdadm can move and use them as needed (giving you a whopping
5% overhead compared to your scheme).
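The shared-spare part would look something like this in mdadm.conf (UUIDs
and the group name are placeholders); note that it is mdadm running in
monitor mode that actually migrates an idle spare to a degraded array in
the same spare-group:

  # /etc/mdadm/mdadm.conf
  MAILADDR root
  ARRAY /dev/md0 UUID=00000000:00000000:00000000:00000000 spare-group=pool
  ARRAY /dev/md1 UUID=11111111:11111111:11111111:11111111 spare-group=pool

  # run the monitor daemon so spares get moved automatically
  mdadm --monitor --scan --daemonise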
Generate DRBD resources A and B from those and have A active by default on
node 1 and B on node 2, thus optimizing things with regard to caching in
the ample RAM you will be supplying. ;)
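In drbd.conf terms that would be roughly the following (hostnames, devices,
addresses and ports are placeholders; heartbeat, or a manual
"drbdadm primary", is what decides where each resource actually runs
Primary):

  resource rA {
    protocol C;
    on node1 {
      device    /dev/drbd0;
      disk      /dev/md0;
      address   10.0.0.1:7788;
      meta-disk internal;
    }
    on node2 {
      device    /dev/drbd0;
      disk      /dev/md0;
      address   10.0.0.2:7788;
      meta-disk internal;
    }
  }
  # resource rB is the mirror image: /dev/drbd1 on top of /dev/md1,
  # port 7789, normally made Primary on node 2 instead of node 1.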
I'm doing this with 8 drives per node and it works like a charm.

> Has anyone else tried to do something like this? How many drives can  
> DRBD handle? How much total storage? If I'm the first then I'm  
> guessing drive failures will be the least of my issues :-/
> 
> 
If you get this all worked out, drive failures and the ability to 
address them in an automatic and efficient manner will be the issue for
most of the lifetime of this project. ^_-

Regards,

Christian
-- 
Christian Balzer        Network/Systems Engineer                NOC
chibi at gol.com   	Global OnLine Japan/Fusion Network Services
http://www.gol.com/


