On 10/07/16 04:34, Mariusz Mazur wrote:
> Raid 5 is dead because on multi-terabyte discs I'm almost guaranteed
> some unrecoverable sectors, which results in data loss. (With
> sufficiently large disks this is also a risk on a two disk raid1.)

I'd disagree with your ideas of RAID1 being a problem. The chance of a
URE on the *same sector* on *both* drives in RAID1 *at the same time* is
infinitesimally small. Also, RAID1 is not limited to two drives: you can
use 3 mirrors, or more...

However, yes, you do potentially have a problem with one drive failed,
and trying to resync from the second drive, and having a URE.

> Thing is, I shouldn't really care when I'm using drbd for everything.
> My best case scenario would be:
>
> 1. A drive in a raid fails.
> 2. I put in a new drive.
> 3. I tell mdadm to rebuild the array, but not abort on unrecoverable
> sectors, just zero them out and log them in a usable manner.

How do you do that? Let's assume you have a 4 drive RAID5 array, one
drive died and is being replaced, and you now have a URE on one of the
three active devices. RAID5 can't recover the data, because it only has
2 out of 4 pieces of the data for that stripe (it needs 3). Will it mark
the sector bad on the disk with the URE and return an error, or will it
fail the entire array? Please feel free to test this (perhaps with some
test devices and/or in a VM).

> 4. I tell drbd to refetch data from these here sectors as provided by mdadm.

How do you do that? Assuming mdadm has simply "marked" the sector as
unreadable, you could:

1) Do a write to that equivalent sector in DRBD. DRBD will then attempt
   to write it, and mdadm can now ask the drive to write the new data to
   the sector. The drive will either succeed, remap it, or fail, in
   which case mdadm will add it to the badblocks list (assuming you have
   that enabled).
2) Tell DRBD to do a verify, wait for it to notice the data on that
   sector is different, and then do a resync from the "other source"
   (make sure you do the verify in the right direction, as DRBD itself
   doesn't know which source is correct).
3) Just invalidate the entire array on this node, and let DRBD resync
   from the other node (without the failed drive).

> 5. Array fully back up, no data missing.
>
> Am I correct that neither mdadm nor drbd are anywhere near supporting
> this workflow?

My understanding is that DRBD + RAID5 is equivalent to RAID51, which is
worse than RAID15.

Potentially, you could use those same 4 drives, create 4 DRBD resources
(one per drive), then use those 4 resources to create a RAID5 array. Not
entirely sure how well that would work, but I expect it would work as
RAID15 would, potentially providing a more resilient result.

If one disk failed, then you would simply replace it, and resync from
the other node. If you encounter a URE, then I'm not sure what DRBD will
do... probably best to test this scenario. If DRBD will just pass the
URE up the chain, then you can do a RAID5 verify/sync which will
re-construct the data from the other drives, re-write the data to DRBD,
which will re-write to the disks, and generally succeed/remap the
sector...

It sounds like an interesting area to run some tests. There is a block
driver which will insert errors; you might find that useful in
generating the "URE"... I'd be quite interested in hearing the results
of your tests.

Regards,
Adam

--
Adam Goryachev
Website Managers
www.websitemanagers.com.au
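
For anyone wanting to put rough numbers on the URE risk discussed in this
message, here is a minimal back-of-the-envelope sketch. Everything in it is
an assumption and not something taken from the thread: a URE rate of 1 per
1e14 bits read (a commonly quoted consumer-drive spec figure, which real
drives may beat by a wide margin), independent errors, a hypothetical 4 TB
disk size, and 4096-byte sectors. It compares a degraded 4-drive RAID5
rebuild, a two-disk RAID1 resync from the one surviving disk, and the case
of the same sector being unreadable on both RAID1 mirrors.

```python
import math

def p_at_least_one_ure(bytes_read: float, ure_per_bit: float = 1e-14) -> float:
    """Probability of at least one URE while reading `bytes_read` bytes,
    assuming independent bit errors at the (assumed) rate `ure_per_bit`."""
    bits = bytes_read * 8
    # 1 - (1 - p)^bits, written with log1p/expm1 to stay accurate for tiny p
    return -math.expm1(bits * math.log1p(-ure_per_bit))

TB = 1e12
DISK_BYTES = 4 * TB      # hypothetical disk size
SECTOR_BYTES = 4096      # assumed sector size

# Degraded 4-drive RAID5: the rebuild must read all 3 surviving disks.
raid5_rebuild = p_at_least_one_ure(3 * DISK_BYTES)

# Two-disk RAID1 with one disk failed: the resync reads the 1 survivor.
raid1_resync = p_at_least_one_ure(DISK_BYTES)

# Healthy two-disk RAID1: data is lost only if the *same* sector is
# unreadable on both mirrors.
p_sector = p_at_least_one_ure(SECTOR_BYTES)
n_sectors = DISK_BYTES / SECTOR_BYTES
raid1_same_sector = -math.expm1(n_sectors * math.log1p(-p_sector ** 2))

print(f"RAID5 rebuild hits a URE:        {raid5_rebuild:.3f}")
print(f"RAID1 resync hits a URE:         {raid1_resync:.3f}")
print(f"RAID1 same sector on both disks: {raid1_same_sector:.2e}")
```

Under these assumptions the rebuild and resync cases come out as substantial
probabilities, while the same-sector-on-both-mirrors case is many orders of
magnitude smaller, which matches the distinction drawn in the reply. The
absolute numbers depend entirely on the assumed error rate and disk size,
so treat them as illustrative only.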