[DRBD-user] OCFS2/GFS+DRBD for HUGE Partition 30TB.

Robert Krig robert at bitcaster.de
Sat Sep 10 00:00:06 CEST 2011


Thank you for your reply. Perhaps I should clarify a few details about
our setup; I may have given the wrong impression.

We have two load-balanced web servers. These would be the clients. Or
rather, Web1 would mount Storage1 and Web2 would mount Storage2.
Web1 and Web2 are automatically load-balanced by our hosting provider.
That is to say, we don't control that aspect of it, yet.

Web1 and Web2 serve our website with a 1Gbit interface each.
Our two new storage systems, however, have a 10Gbit interface each.

Currently we have more than enough resources to serve our data to our
webservers. That is to say, as far as our current storage systems are
concerned, we have no problems with CPU usage, RAM usage, or even I/O or
network bandwidth.

However, we are running out of space.

Our current setup (which was done before I joined the company) is really
braindead. Storage1 and Storage2 keep in sync via rsync, which is run by
a PHP script as soon as a file is uploaded.
Also, if one of our developers uploads new code for the site, it gets
uploaded to one node and then replicated to the others via rsync in a cronjob.
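For illustration, the cronjob side of that boils down to something like
the following (hostnames, paths, and schedule are made up, not our real
config):

```
# Hypothetical crontab entry: push /data from Storage1 to Storage2 every 5 minutes.
# -a preserves permissions and timestamps; --delete removes files gone from the source.
*/5 * * * * root rsync -a --delete /data/ storage2:/data/
```

If the push fails or a file lands on the "wrong" node between runs, the
two copies silently diverge, which is exactly the problem.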

Needless to say, this situation is far from ideal. When uploading a file
to one of the storage nodes, the PHP script takes the name and directory
of that file and syncs it directly via rsync. If for some reason this
fails, we end up with inconsistent data.

Sometimes we notice and rsync the entire storage volume manually from
one node to the other. But as you can imagine, with 13TB that can take a
while, especially since it's a LOT of files and quite a few subdirectories.

Anyways, my idea was to use DRBD and OCFS2 to create a more elegant
setup so that we only have to worry about one central storage area. None
of this rsyncing all over the place.
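What I have in mind is dual-primary DRBD with OCFS2 on top. Roughly, the
resource definition would look like this; a sketch only, with placeholder
hostnames, devices, and addresses, not a tested config:

```
resource storage {
  net {
    allow-two-primaries;               # both nodes primary, needed for OCFS2 on top
    after-sb-0pri discard-zero-changes;
    after-sb-1pri discard-secondary;
  }
  on storage1 {
    device    /dev/drbd0;
    disk      /dev/vg0/storage;
    address   10.0.0.1:7788;
    meta-disk internal;
  }
  on storage2 {
    device    /dev/drbd0;
    disk      /dev/vg0/storage;
    address   10.0.0.2:7788;
    meta-disk internal;
  }
}
```

With fencing configured properly, of course; dual-primary without it is
asking for split-brain.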

Anyways, so yeah, a parallelized application of sorts is what we have.
We intend to upgrade our webservers in the near future, of course; then
we will be using machines with 10Gbit interfaces to maximize throughput.
But first we wanted to iron out some headaches we've had with our
storage system.

Now, as far as expanding storage space is concerned: this would be done
through the addition of Direct Attached Storage, which our server
provider offers. So in terms of adding more physical storage to the
existing nodes, we've got that covered. I'm more worried about the size
limits of the filesystem itself.
Like I mentioned, moving this many terabytes of data around takes a
while, even with 10Gbit. It would be quite impractical to have to
restart from scratch every time I want to extend the file space.

That's what's nice about LVM, for example. I can have multiple RAID
arrays underneath, just add them to my volume group, expand the
respective logical volume, and of course the DRBD block device that's on
top of it, all on the fly. From what I can tell, OCFS2 can do this as well.
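Assuming that stack (RAID under LVM, DRBD on the LV, OCFS2 on DRBD), the
online grow would go roughly like this; device names are placeholders,
and I haven't verified every flag:

```
vgextend vg0 /dev/md3               # add the new RAID array to the volume group
lvextend -L +10T /dev/vg0/storage   # grow the logical volume (on both nodes)
drbdadm resize storage              # grow DRBD once both backing devices are bigger
tunefs.ocfs2 -S /dev/drbd0          # grow OCFS2 to fill the device
```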

So that's all cool. Like I said, I'm a bit worried when it comes to
filesystem size limits. For example, using OCFS2 with a 4k blocksize
results in a limit of 16TB, at least when using a 32-bit journal.

Ok, so I can format OCFS2 using a different blocksize, but then that
probably means I have to choose well, based upon the size of storage
that we imagine we will grow to. Or is it possible to change the
blocksize of OCFS2 afterwards, without having to reformat everything?
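For reference, blocksize and cluster size are picked at format time with
mkfs.ocfs2; something like the following, where the label, sizes, and
device are just examples:

```
# -b: blocksize, -C: cluster size, -N: node slots, -L: volume label
mkfs.ocfs2 -b 4K -C 1M -N 2 -L storage /dev/drbd0
```

A larger cluster size trades some space efficiency on small files for a
higher maximum volume size, which may matter less for us since most of
our files are large.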

Oh, by the way: you said that two nodes pumping 30TB of data would get
clobbered really hard. Well, true, if the entire 30TB were accessed all
the time, which it isn't. Like I said, our capacities in terms of
bandwidth exceed our needs, and will do so for quite a while. Actually,
I think that our transferred volume of data currently averages 500GB per day.
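A quick back-of-the-envelope check of that figure, assuming decimal GB:

```shell
# 500 GB/day expressed as an average bitrate in Mbit/s
awk 'BEGIN { printf "%.1f\n", 500e9 * 8 / 86400 / 1e6 }'
```

That works out to roughly 46 Mbit/s sustained, which is nowhere near
saturating even a single 1Gbit link, let alone 10Gbit.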
