Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.
> re: paranoid:
> He is effectively trying to do a RAID10. That is in theory the most
> reliable of the basic RAID levels. (better than raid3/4/5 for sure. I
> don't know about raid6, raid50, or raid60.) In all cases raid is only
> reliable if the system is well monitored and failed disks are rapidly
> replaced. ie. if you leave a disk in a failed mode for a month you
> have a huge window of vulnerability for a second disk crash bringing
> down the whole raid.

The second disk failure would have to be the partner of the first (as you say later), but yes, I agree. With 4+ disks, raid10 is likely to be one of the better options. You could go down the route of raid 15 if you wanted to be even more sure - but that's starting to get silly for almost all uses. 3-way mirroring would be more sensible (isn't there something about >2 nodes in DRBD on the roadmap? *grin*)

> Specifically, he wants to stripe together 16 mirror pairs. Each
> mirror pair should be extremely reliably if the failed drive is
> rapidly detected, replaced, and resync'ed. The RAID10 setup would be
> 1/16th as reliable, but in theory that should still be very good.

Striping the mirrored pairs is certainly the most sensible (under the majority of circumstances) of the stacked RAID options. In the case of Solaris you configure the system as raid 0+1, but it actually runs as raid 1+0 underneath (DiskSuite/LVM/whatever it is called this week).

> re: MD not cluster aware.
> I'm assuming the OP wants to have MD itself managed by heartbeat in a
> Active / Passive setup. If so, you only have one instance of MD
> running at a time. MD maintains all of its meta-data on the
> underlying disks I believe, so drbd should be replicating the drbd
> meta-data between the drives as required.

If you're running active/passive, and are willing to do all of the logic in your own scripts, then there's no reason why this wouldn't work - hell, you could run LVM on top of your MD device quite happily. DRBD will keep the block devices in sync, and you can do the rest yourself. As long as you take the appropriate precautions (make sure the MD devices are accessed SOLELY through drbd, make sure that they are NOT auto-initialised/started, etc.) this should be fine - there's a rough sketch of one way to wire this up further down. However, the OP was talking about running GFS on the DRBD device, and that only really makes sense if you're going dual active.

> If you have a complete disk failure, drbd should initiate i/o shipping
> to the alternate disk, right? So the potential exists to have a
> functioning RAID10 even in the presence of 16 disk failures. (ie.
> exactly one failure from each of the mirror pairs.) OTOH, if you lose
> both disks from any one pair, the whole raid10 is failed.

I've had a dual-primary setup with one of the mirrors out of sync, so it can certainly manage that. If you're looking at this much data, you're going to want to source the disks from different manufacturers (ideally), or at least from different batches. Not all the stories about whole batches failing at the same time are urban myths.

> What happens (from a drbd perspective) if you only have sector level
> failures on a disk?

That's a configuration option within DRBD itself - what to do on I/O errors. Most people (I would have thought) are going to end up using the option to detach from the local copy and serve the data from the remote version. With adequate monitoring/hotswap, the failed disk can then be replaced and re-synced - I believe, hardware willing, totally transparently to the higher levels of the system. (Sample config fragment below.)
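For what it's worth, here's a rough sketch of the sort of stacking I have in mind - 16 md mirror pairs striped together, the arrays kept out of the distro's automatic raid startup, and the resulting device only ever touched through a DRBD resource sitting on top. Treat it as a sketch of one possible reading of the setup rather than a recipe: the device names, hostnames and addresses are made up, and you should check the exact syntax against the mdadm and drbd.conf man pages for whatever versions you're running.

# create the mirror pairs (placeholder disk names; using whole disks
# or non-0xfd partitions also keeps them away from kernel autodetect)
mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda /dev/sdb
# ...and likewise for /dev/md1 .. /dev/md15 with the remaining disks...

# stripe the pairs together (newer md also has a native raid10
# personality that can do the whole thing as a single array)
mdadm --create /dev/md16 --level=0 --raid-devices=16 /dev/md{0..15}

# keep boot-time assembly out of the picture so the arrays only come up
# under the control of your own / heartbeat's scripts: raid=noautodetect
# on the kernel command line, and on newer mdadm an "AUTO -all" line in
# /etc/mdadm.conf

# /etc/drbd.conf - the stripe is only the backing store; everything
# above this point in the stack sees /dev/drbd0 and nothing else
resource r0 {
    protocol C;
    on nodeA {
        device    /dev/drbd0;
        disk      /dev/md16;
        address   192.168.1.1:7788;
        meta-disk internal;
    }
    on nodeB {
        device    /dev/drbd0;
        disk      /dev/md16;
        address   192.168.1.2:7788;
        meta-disk internal;
    }
}

Heartbeat then just promotes /dev/drbd0 and mounts it on whichever node is active; the MD layer never gets touched directly.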
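And the I/O error policy I mentioned above lives in the disk section of the resource - something like the fragment below, though the exact option names have moved around between DRBD versions, so check drbd.conf(5) for yours:

disk {
    on-io-error detach;   # drop the failed local backing device and keep
                          # serving reads/writes from the peer's copy
                          # (the other choices are along the lines of
                          # pass_on / call-local-io-error, version depending)
}

Once the dead disk has been swapped out and the backing device re-created, re-attaching it should trigger a resync from the peer, which is the "transparent to the higher levels" part.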
The only reasons why this solution would be a problem are if the design is supposed to be dual active - or the person setting it up didn't cover their options properly.

Graham