[DRBD-user] (off topic) alternative to drbd

John Lauro john.lauro at covenanteyes.com
Thu Feb 2 04:24:15 CET 2012

I apologize for the off-topic discussion.  If there is a more general list
about the dozens of network storage options, please point me in the right
direction.


I am testing various network redundancy and failover options, and I am
throwing out an option for comment that I haven't really seen discussed
anywhere (here or elsewhere) as a viable configuration.  I am only
mentioning it because I did some tests in the lab and it worked better than
I expected, given how rarely it is mentioned as an option.  The test
consisted of two storage servers (I used NFS, but iSCSI, etc. should work
if someone wants to replicate it) and a compute server hosting VMs whose
storage lived on the NFS servers.


Here is the twist: instead of using drbd, or some other mechanism on the
storage side, to provide the redundant storage, I set up a software RAID 1
inside the guest with two virtual disks backed by different NFS servers.
After everything was set up and running fine, I killed one of the NFS
servers (technically I just blocked it with iptables) and the guest would
freeze on some I/O for a while.  During that time, cached I/O was fine, so
you were OK if you stuck to common directories, etc.; sessions touching
non-cached data would freeze, but you could open a new session just fine.
After a bit (maybe two minutes, but probably configurable somewhere) the
software RAID marked the failed drive as faulty and everything inside the
VM was fine again for all processes.  The only noticeable problem was the
momentary hang of a few processes until the drive was marked as failed.
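
For concreteness, the in-guest piece is just plain Linux md RAID 1.  A
minimal sketch, assuming the two NFS-backed virtual disks show up in the
guest as /dev/vdb and /dev/vdc (device names, address, and mount point are
placeholders):

    # inside the guest: mirror the two NFS-backed virtual disks;
    # an internal write-intent bitmap keeps later re-syncs short
    mdadm --create /dev/md0 --level=1 --raid-devices=2 --bitmap=internal \
          /dev/vdb /dev/vdc
    mkfs.ext4 /dev/md0
    mount /dev/md0 /mnt/data

    # the "kill" was simply blocking one NFS server from the compute node,
    # e.g. (10.0.0.11 standing in for one NFS server's address):
    #   iptables -A OUTPUT -d 10.0.0.11 -j DROP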


I then made the NFS server available again; I had to manually re-add the
failed device, but it re-synced quickly, since the software RAID keeps a
rough track of the areas of the disk that changed.  Then I repeated the
process with the other NFS server to verify that I could kill either NFS
server without significant downtime.
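
The re-add itself is a one-liner; a rough sketch, again with /dev/vdc
standing in for the disk backed by the NFS server that went away (with an
internal write-intent bitmap, md only re-syncs the regions that changed
while the member was missing, which is why it is quick):

    # a degraded mirror shows up as [2/1] [U_] (or [_U]) here
    cat /proc/mdstat

    # once the NFS server is reachable again, put the member back
    mdadm --manage /dev/md0 --re-add /dev/vdc

    # watch the (bitmap-based, usually quick) re-sync
    watch cat /proc/mdstat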


Pros:

No split-brain risk, as the brain is in the VM instead of the storage
nodes.

Load-balanced reads - very fast I/O when you have multiple processes
reading.

Some VMs can use just one server and some can use redundant storage,
without complex pre-planning, LVM changes, etc.

(In addition - not exactly a pro, as it is common with most commodity
hardware options - it should be fine to have compute and storage resources
on the same physical box.  If you do, you can optionally give the local
disk read priority in the software RAID; see the sketch after this list.)
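
For that local-read-priority case, md's write-mostly flag should do it:
members flagged write-mostly are only read from when no other copy is
available.  A sketch, assuming /dev/vdb is the local disk and /dev/vdc is
the NFS-backed one (placeholder names):

    # reads are served from the local /dev/vdb whenever possible;
    # the NFS-backed /dev/vdc mostly just takes the mirrored writes
    mdadm --create /dev/md0 --level=1 --raid-devices=2 --bitmap=internal \
          /dev/vdb --write-mostly /dev/vdc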


Cons:

More work in setting up the guest, instead of the upfront extra work of
setting up storage and STONITH.  (Not that bad, and that's what templates
are for.)

Not sure how difficult booting would be with one of the storage units
down.  Having disks go away and come back while running seems fine, but
there may be extra work to force a guest online while one of the storage
devices is down.

It did not auto-recover.  A reboot would probably recover it easily, but
assuming you want zero downtime...  You just have to make sure you are set
up to monitor /proc/mdstat or similar to detect a broken RAID so you know
it needs attention (see the monitoring sketch after this list).

Automatic recovery could get complicated if you also want the compute side
to be failover/redundant, and want things to work smoothly while one of the
storage nodes is down at the same time.  It shouldn't be too bad if you are
OK with either a storage node going down or a compute node going down, but
if two of the four go down at the same time, manual intervention may be
required if you can't get the other storage unit back online.
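
For the monitoring point above, mdadm's own monitor mode is probably
enough; a sketch (the mail address is a placeholder):

    # run inside the guest; sends mail on DegradedArray, Fail, etc.
    mdadm --monitor --scan --daemonise --mail admin@example.com

    # or a crude periodic check straight from /proc/mdstat:
    # a "_" in the [UU] status field means a missing mirror member
    grep -q '\[U*_U*\]' /proc/mdstat && echo "md array degraded on $(hostname)"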


At this point I am not sure I would recommend or use it over drbd or any of
the various cluster filesystems, etc.; it just tested out well enough that
I am at least considering it, given that most of my servers (all but maybe
3%) don't need redundant network storage beyond what's built into the
boxes: the majority of our servers are active/active with redundant
failover load balancers in front of them, or active/passive with synced
configs, or are simply not critical 24/7/365.

