[DRBD-user] raid recovery and drbd

Mon Jul 11 01:58:09 CEST 2016

On 10/07/16 04:34, Mariusz Mazur wrote:
> Raid 5 is dead because on multi-terabyte discs I'm almost guaranteed
> some unrecoverable sectors, which results in data loss. (With
> sufficiently large disks this is also a risk on a two disk raid1.)
I'd disagree that RAID1 is a problem here. The chance of a URE on the
*same sector* on *both* drives in RAID1 *at the same time* is
infinitesimally small.
Also, you are assuming RAID1 with only two drives; you can use 3
mirrors, or more...

However, yes, you do potentially have a problem if one drive has failed
and you hit a URE on the second drive while resyncing from it.
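
To put rough numbers on both cases, here is a back-of-the-envelope sketch.
It assumes a URE rate of 1 in 10^14 bits (a figure often quoted on consumer
drive datasheets, not something taken from this thread), 4 KiB sectors, and
independent errors, which real drives do not strictly obey.

```python
import math

def p_at_least_one_ure(bytes_read, ure_per_bit=1e-14):
    """P(at least one URE) when reading bytes_read bytes from one drive.

    Assumption: independent bit errors at a constant rate.
    """
    bits = bytes_read * 8
    # 1 - (1 - p)^bits, computed in a numerically stable way for tiny p
    return -math.expm1(bits * math.log1p(-ure_per_bit))

def p_same_sector_on_both(drive_bytes, sector_bytes=4096, ure_per_bit=1e-14):
    """Approximate P(some sector is unreadable on BOTH mirrors at once)."""
    p_sector = sector_bytes * 8 * ure_per_bit   # one sector, one drive
    sectors = drive_bytes / sector_bytes
    return sectors * p_sector ** 2              # union bound over all sectors

TB = 10**12
for size_tb in (1, 4, 8):
    print(f"{size_tb} TB resync from a single drive: "
          f"P(URE) ~ {p_at_least_one_ure(size_tb * TB):.2f}")
print(f"4 TB, same sector bad on both mirrors: "
      f"~ {p_same_sector_on_both(4 * TB):.1e}")
```

Under these assumptions the single-source resync is the risky case, while a
coincident failure of the same sector on two healthy mirrors is many orders
of magnitude less likely.
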
> Thing is, I shouldn't really care when I'm using drbd for everything.
> My best case scenario would be:
>
> 1. A drive in a raid fails.
> 2. I put in a new drive.
> 3. I tell mdadm to rebuild the array, but not abort on unrecoverable
> sectors, just zero them out and log them in a usable manner.
How do you do that?
Let's assume you have a 4-drive RAID5 array, one drive has died and is
being replaced, and you now have a URE on one of the three active devices.
RAID5 can't recover the data in that stripe, because it only has 2 out
of the 4 chunks and needs at least 3.
Will it mark the sector bad on the disk with the URE and return an 
error, or will it fail the entire array?
Please feel free to test this (perhaps with some test devices and/or in 
a VM).
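
The "2 out of 4 pieces" problem can be seen in a toy model of single-parity
XOR. This is only an illustration of the arithmetic, not how md is
implemented; the chunk contents and function names are made up.

```python
from functools import reduce

def xor_chunks(chunks):
    """Byte-wise XOR of equally sized chunks."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*chunks))

def reconstruct(stripe):
    """Return the stripe with a single missing chunk (None) rebuilt.

    Raises ValueError when more than one chunk is missing, which is the
    'failed drive plus URE on another drive' case.
    """
    missing = [i for i, c in enumerate(stripe) if c is None]
    if len(missing) > 1:
        raise ValueError(f"cannot rebuild: {len(missing)} chunks missing")
    stripe = list(stripe)
    if missing:
        stripe[missing[0]] = xor_chunks([c for c in stripe if c is not None])
    return stripe

data = [b"AAAA", b"BBBB", b"CCCC"]
stripe = data + [xor_chunks(data)]       # 3 data chunks + 1 parity chunk

# One drive gone: the missing chunk is the XOR of the other three.
assert reconstruct([None] + stripe[1:]) == stripe

# One drive gone plus a URE in the same stripe: nothing left to XOR against.
try:
    reconstruct([None, None] + stripe[2:])
except ValueError as err:
    print(err)
```
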
> 4. I tell drbd to refetch data from these here sectors as provided by mdadm.
How do you do that? Assuming mdadm has simply "marked" the sector as
un-readable, you could:
1) Write to the equivalent sector through DRBD. DRBD will then attempt
the write, mdadm can ask the drive to write the new data to the sector,
and the drive will either succeed, remap it, or fail, in which case
mdadm will add it to the badblocks list (assuming you have that
enabled).
2) Tell DRBD to do a verify, wait for it to notice the data on that 
sector is different, and then do a resync from the "other source" (make 
sure you do the verify in the right direction, as DRBD itself doesn't 
know which source is correct).
3) Just invalidate the entire array on this node, and let DRBD resync 
from the other node (without the failed drive).
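
Options 2 and 3 map onto drbdadm subcommands roughly as below. This is a
dry-run sketch that only prints the commands; the resource name "r0" is
hypothetical, and which node ends up as the sync source after a verify
should be checked against the documentation for your DRBD version before
running any of it for real.

```python
def verify_then_resync(resource):
    """Option 2: online verify, then reconnect to resync flagged blocks."""
    return [
        ["drbdadm", "verify", resource],
        # Wait until the verify run has finished before the next two steps.
        ["drbdadm", "disconnect", resource],
        ["drbdadm", "connect", resource],
    ]

def full_resync(resource):
    """Option 3: run on the node whose local data is to be discarded."""
    return [["drbdadm", "invalidate", resource]]

if __name__ == "__main__":
    for argv in verify_then_resync("r0") + full_resync("r0"):
        print(" ".join(argv))   # printed only, nothing is executed
```
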
> 5. Array fully back up, no data missing.
>
> Am I correct that neither mdadm nor drbd are anywhere near supporting
> this workflow?

My understanding is that DRBD on top of RAID5 is equivalent to RAID51
(a mirror of RAID5 arrays), which is worse than RAID15 (RAID5 across
mirrored pairs).
Potentially, you could use those same 4 drives, create 4 DRBD resources, 
then use those 4 resources to create a RAID5 array.
Not entirely sure how well that would work, but I expect it would work 
as RAID15 would, potentially providing a more resilient result.
If one disk failed, you would simply replace it and resync from the
other node. If you encounter a URE, I'm not sure what DRBD will do...
probably best to test this scenario. If DRBD just passes the URE up the
chain, then you can do a RAID5 verify/sync, which will reconstruct the
data from the other drives and rewrite it to DRBD, which will rewrite
it to the disks, where the write will generally succeed or remap the
sector...
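
The RAID51 versus RAID15 comparison can be checked by brute force for the
2 x 4 drive case. The sketch below counts only whole-disk failures and
ignores UREs, rebuild windows, and a whole node going down at once, so
treat it as a rough comparison under those assumptions.

```python
from itertools import combinations
from math import comb

N = 4  # drives per node; a disk is (node, slot) with node in {0, 1}
DISKS = [(node, slot) for node in (0, 1) for slot in range(N)]

def raid51_survives(failed):
    """Mirror of two RAID5 arrays: OK if either node's array lost <= 1 disk."""
    return any(sum(1 for n, _ in failed if n == node) <= 1 for node in (0, 1))

def raid15_survives(failed):
    """RAID5 over mirrored pairs: OK if at most one pair lost both disks."""
    dead = sum(1 for s in range(N) if (0, s) in failed and (1, s) in failed)
    return dead <= 1

def survivors(check, k):
    """Number of k-disk failure combinations the layout survives."""
    return sum(1 for f in combinations(DISKS, k) if check(set(f)))

print("failed  combos  RAID51  RAID15")
for k in range(1, 7):
    print(f"{k:6d}  {comb(2 * N, k):6d}  "
          f"{survivors(raid51_survives, k):6d}  "
          f"{survivors(raid15_survives, k):6d}")
```

In this model both layouts survive any three failed disks; from four
failures onward RAID15 survives more of the possible combinations.
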

It sounds like an interesting area to run some tests. There is a block
driver which will insert errors; you might find that useful for
generating the "URE"...
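
One way to fake a bad range, assuming device-mapper's "error" target is an
acceptable stand-in for the driver mentioned above, is a linear mapping with
an error hole in it. The sketch below only builds the table text; the device
path and sector numbers are hypothetical. Note that the error target fails
writes as well as reads, so unlike a real URE the range will not be "fixed"
by rewriting it.

```python
def dm_table_with_bad_sectors(device, total_sectors, bad_start, bad_len):
    """Build a device-mapper table: linear mapping with one 'error' hole.

    All values are in 512-byte sectors, as device-mapper tables expect.
    """
    if bad_start < 0 or bad_len <= 0 or bad_start + bad_len > total_sectors:
        raise ValueError("bad range must lie inside the device")
    lines = []
    if bad_start > 0:
        lines.append(f"0 {bad_start} linear {device} 0")
    lines.append(f"{bad_start} {bad_len} error")
    tail = bad_start + bad_len
    if tail < total_sectors:
        lines.append(f"{tail} {total_sectors - tail} linear {device} {tail}")
    return "\n".join(lines)

if __name__ == "__main__":
    # Hypothetical 1 GiB loop device with 8 bad sectors in the middle.
    print(dm_table_with_bad_sectors("/dev/loop0", 2097152, 1000000, 8))
```

The printed table can be piped into `dmsetup create <name>`, and the
resulting device used as an md or DRBD backing device in a test VM.
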

I'd be quite interested in hearing the results of your tests.

Regards,
Adam

-- 
Adam Goryachev Website Managers www.websitemanagers.com.au