I'm currently in the process of implementing a new storage cluster for my employer. We're looking at about 30TB of mirrored storage on two storage servers. The intention is to expand this in the near future, so in one or two years we could be looking at 40-60TB of storage. The storage is meant for plain file storage, with files in the range of 10-250MB on average.

The idea is to run DRBD in dual-primary mode, so that uploads to either node are synchronised and the data set is always consistent. Right now we are more concerned with redundancy than with load balancing, but I figure that a load-balanced setup now will save us some headaches in the future, once demand exceeds the capabilities of a single node.

It would greatly simplify things if I could set up the 30TB as a single volume mounted as /storage on the server. Of course, this is where I'm running into partition size limits on various ends. My intention was to use the OCFS2 filesystem, but here I ran into my first problem: if I use 4k blocks, the maximum partition size is 16TB, unless I use the 64-bit journal option or a different block size.

So I have a couple of questions and observations before I decide on a definite path.

1. First of all, am I on the right path? Is there perhaps a different approach I should be using?
2. Are there any drawbacks to creating a filesystem with, let's say, 1M blocks?
3. I need to create everything in such a way that I can seamlessly extend the storage later on, e.g. simply resize the 30TB volume without having to reformat everything. As you can imagine, shuffling around 30TB of data takes forever, no matter how much bandwidth you have.
4. Has anyone had experience with running a storage cluster of this size, i.e. over 16TB?
5. Most tutorials I've seen on DRBD suggest going with OCFS2. Is there any inherent advantage to OCFS2 over GFS, or vice versa?
6. Is a third "brain" node necessary for my setup? Each of our storage nodes serves data to one of two load-balanced Apache servers. As such, the "load balancing" is more or less automatic, at least as far as web requests are concerned.
7. Please post any further suggestions you might have.
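
For reference, the 16TB figure above is consistent with a 32-bit count of 4k allocation units (2^32 x 4KiB = 16TiB). The sketch below only works through that arithmetic. It assumes the limit is purely a 32-bit unit count and treats TB as binary units; both are my assumptions, not something taken from the OCFS2 documentation, and the function names are made up for illustration.

```python
# Rough arithmetic only. ASSUMPTION: the volume size limit comes from
# counting allocation units with a 32-bit number. Not verified against
# the OCFS2 documentation.
ADDRESS_BITS = 32

KIB = 1024
TIB = 1024 ** 4


def max_volume_bytes(unit_bytes, address_bits=ADDRESS_BITS):
    """Largest volume addressable with the given allocation unit size."""
    return (2 ** address_bits) * unit_bytes


def smallest_unit_for(target_bytes, address_bits=ADDRESS_BITS):
    """Smallest power-of-two unit size (starting at 512 bytes) that can
    address at least target_bytes under the same assumption."""
    unit = 512
    while max_volume_bytes(unit, address_bits) < target_bytes:
        unit *= 2
    return unit


# Print the ceiling for a few candidate unit sizes.
for unit in (4 * KIB, 8 * KIB, 16 * KIB, 1024 * KIB):
    limit_tib = max_volume_bytes(unit) // TIB
    print(f"{unit // KIB:>5} KiB units -> max {limit_tib} TiB")

# Unit sizes needed for the current 30TB and the planned 60TB.
for target_tib in (30, 60):
    unit = smallest_unit_for(target_tib * TIB)
    print(f"{target_tib} TiB needs units of at least {unit // KIB} KiB")
```

Under these assumptions a 30TB volume would already need units larger than 4k, and the planned 60TB would need larger units still, which is why I'd rather settle the unit size (or the 64-bit journal option) before putting any data on the volume.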