[DRBD-user] Expanding a cluster

Thu Jan 31 18:27:28 CET 2013

On 01/02/13 04:04, Justin Edmands wrote:
> I'm on the fence about the amount of time it will take to degrade and
> rebuild a RAID6 at 16 drives (x2 systems). 
> 
> Anyone against the idea of:
> Back up data Friday night through Saturday morning
> stop drbd and heartbeat on node2
> replace all drives on node2
> build raid 6 and match setup/sizes from node1
> initialize metadata, etc.
> start drbd and heartbeat
> let it sync
> make node2 primary
> repeat steps for node1

In theory, the set of drives you pull from the secondary is an extra
backup: you could put all those drives back in and make that set the
primary. In some ways this might be a better solution, since you are
then simply doing a single large read on the primary and a single large
write on the secondary. There are no RAID rebuilds, except for the
initial resync of the new array on the secondary (which you might be
able to skip, since you know DRBD will write to every sector very soon
when it does the sync).

1) Stop DRBD on secondary
2) Pull all drives on secondary
3) Install the new drives on secondary and build new RAID6 array
4) Initialize DRBD metadata and enable DRBD on secondary
5) sync from primary to secondary

There is a danger of read errors on the primary during this sync, but I
would guess this is better than doing 16 rebuilds.
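
To put rough numbers on that guess, here is a small sketch comparing how
much data has to be read cleanly under each plan. It assumes decimal
terabytes, the 16 x 1 TB RAID6 from the original post, and that each
rebuild reads every surviving drive in full. The function names are made
up for illustration.

```python
# Rough comparison of how much data must be read without error under each plan.
# Assumptions (not from the thread): decimal terabytes, and a rebuild that
# reads every surviving member in full.

TB = 10**12

def rebuild_read_bytes(drives, drive_bytes, rebuilds):
    """Bytes read across `rebuilds` one-at-a-time rebuilds of the array."""
    surviving = drives - 1  # one member is missing during each rebuild
    return rebuilds * surviving * drive_bytes

def full_sync_read_bytes(drives, drive_bytes, parity=2):
    """Bytes read from the primary for one full sync (usable capacity)."""
    return (drives - parity) * drive_bytes

one_by_one = rebuild_read_bytes(drives=16, drive_bytes=1 * TB, rebuilds=16)
single_sync = full_sync_read_bytes(drives=16, drive_bytes=1 * TB)

print(f"16 rebuilds: {one_by_one / TB:.0f} TB read per server")
print(f"one sync:    {single_sync / TB:.0f} TB read from the primary")
print(f"ratio:       {one_by_one / single_sync:.1f}x")
```

Under these assumptions the one-by-one plan reads far more data per
server than a single sync does, which is where the extra exposure to
read errors comes from.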

Personally, I would try to set the primary read-only during the process
(if that is an option) so that the "spare" set of drives stays an exact
match to the primary (i.e., it doesn't get outdated).

That depends on how much downtime can be scheduled.
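
For planning the downtime window, a back-of-the-envelope estimate of the
full sync time may help. This is only a sketch: the sustained rates
below are hypothetical, and the 14 TB figure is simply the usable size
of 16 x 1 TB drives in RAID6 in decimal units.

```python
# Estimate how long a full DRBD sync takes at a constant sustained rate.
# The rates are hypothetical; measure your own link and arrays.

TB = 10**12
MB = 10**6

def sync_hours(device_bytes, rate_bytes_per_s):
    """Hours needed to copy `device_bytes` at `rate_bytes_per_s`."""
    return device_bytes / rate_bytes_per_s / 3600

usable = 14 * TB  # 16 x 1 TB in RAID6, per the original post

for rate_mb in (50, 100, 200):
    hours = sync_hours(usable, rate_mb * MB)
    print(f"{rate_mb:>3} MB/s -> {hours:5.1f} h")
```

At an assumed 100 MB/s this works out to roughly a day and a half, so
the sync itself is unlikely to fit in a single overnight window.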

Finally, I think you have a fairly high risk with 16 drives in a single
RAID6. You might consider 2 sets of 8 drives in RAID6, and do a linear
concat of the two sets (or RAID0). That allows you to lose any 2 out of
each 8 drives, instead of only 2 out of 16. Also, the chance of a URE on
just one of the remaining 14 drives after a 2-drive failure is not a
risk I would want to take. Though it depends on your capacity
requirements whether you can spend another 2 drives to ensure you don't
lose the data.
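
A quick sketch of that trade-off, for illustration only. It assumes
2 TB drives (decimal), a URE rate of 1 per 10^14 bits (a figure commonly
quoted on consumer drive spec sheets, not something from this thread),
and independent bit errors, which is a simplistic and probably
pessimistic model.

```python
import math

TB = 10**12
URE_PER_BIT = 1e-14  # assumed spec-sheet figure, not from the thread

def raid6_usable(drives, drive_bytes):
    """Usable bytes of one RAID6 set (two drives' worth of parity)."""
    return (drives - 2) * drive_bytes

def p_ure(bytes_read, rate=URE_PER_BIT):
    """P(at least one URE) while reading `bytes_read`, independent bits."""
    bits = bytes_read * 8
    # 1 - (1 - rate)**bits, computed in a numerically stable way
    return -math.expm1(bits * math.log1p(-rate))

drive = 2 * TB
single = raid6_usable(16, drive)    # one 16-drive RAID6
split = 2 * raid6_usable(8, drive)  # two 8-drive RAID6 sets, concatenated

# After a 2-drive failure, the rebuild has to read every surviving drive
# in the affected set: 14 drives in the single array, 6 in an 8-drive set.
p_single = p_ure(14 * drive)
p_split = p_ure(6 * drive)

print(f"single 16-drive RAID6: {single / TB:.0f} TB usable, P(URE) ~ {p_single:.2f}")
print(f"two 8-drive RAID6:     {split / TB:.0f} TB usable, P(URE) ~ {p_split:.2f}")
```

The split layout gives up two drives' worth of capacity, and in this
model it noticeably lowers the chance of hitting a URE during a rebuild
with no redundancy left.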

Just my 0.02c worth....

At the end of the day, the direct answer to the original question was
RTFM (it really is a very nice manual), and you didn't tell us what
version of DRBD you use. The rest is really off-topic for this list;
maybe discuss it on the linux-raid list if you are interested.
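
For reference, the resize order from my earlier reply (array first, then
DRBD, then the filesystem) could be written down as a dry-run plan like
the one below. It only prints commands and runs nothing. The names
/dev/md0, r0 and /dev/drbd0 are placeholders, and resize2fs assumes an
ext2/3/4 filesystem; none of that is known from this thread, so check
the manual for your DRBD version before running anything like it.

```python
# Dry-run plan for the resize order: grow the array, then DRBD, then the
# filesystem. Device and resource names are placeholders, and resize2fs
# assumes an ext2/3/4 filesystem; none of these come from the thread.

import shlex

def resize_plan(md_dev="/dev/md0", resource="r0", drbd_dev="/dev/drbd0"):
    """Return (where, argv) pairs in the order they would be run."""
    return [
        ("both nodes", ["mdadm", "--grow", md_dev, "--size=max"]),
        ("one node", ["drbdadm", "resize", resource]),
        ("primary", ["resize2fs", drbd_dev]),
    ]

for where, argv in resize_plan():
    print(f"[{where}] {shlex.join(argv)}")
```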

Regards,
Adam

> On Thu, Jan 31, 2013 at 11:20 AM, Adam Goryachev
> <mailinglists at websitemanagers.com.au> wrote:
> 
>     On 01/02/13 02:58, Marcelo Pereira wrote:
>>     Hello Everyone,
>>
>>     I'm about to perform an upgrade on my servers and I was wondering
>>     how to do that.
>>
>>     Here is the scenario:
>>
>>     Server A has 16x 1Tb hard drives, under RAID-6.
>>     Server B has 16x 1Tb hard drives, under RAID-6.
>>
>>     And both are in sync, using DRBD.
>>
>>     I thought about replacing the hard drives with 2Tb units, one by one.
>>
>>     So, on each run, I would:
>>
>>       * Remove a 1Tb disk
>>       * Add a 2Tb disk
>>       * Wait for it to rebuild the RAID
>>
>>     After replacing ALL disks, I would expand the RAID unit, on each
>>     server.
>>
>>     However, I was wondering how DRBD would "like" this procedure.
>>
>>     I know that, before "expanding" the RAID, the cluster size, and
>>     the block numbers would remain the same, as I would be "wasting"
>>     the extra space on the newly added drives.
>>
>>     So, after "both" servers have all the drives replaced, and the
>>     RAID is properly rebuilt, would it be a problem to expand it?
>>     How would DRBD handle it?
>>
>>     I will appreciate any comment or suggestion here.
>     DRBD will work perfectly...
> 
>     You probably need to do the following:
>     1) Pull one drive and replace (you could do one on each server at
>     the same time, although better/safer to do one server at a time)
>     2) Wait for rebuild to complete
>     3) Repeat for all disks on BOTH servers
>     4) Resize the RAID array on each server
>     5) Resize DRBD (see the fantastic online manual for your version of
>     DRBD for the details)
>     6) Resize the underlying filesystem or whatever
> 
>     BTW, depending on your kernel version and/or RAID (I'm assuming
>     linux software raid), you might like to query the linux-raid list to
>     see if you can ADD the new drive and tell md that this new drive is
>     replacing drive X. This way you avoid degrading the RAID array,
>     hence lose less performance during the rebuild, and have a lower
>     risk of disk failure and especially URE (Unrecoverable Read Error)
>     during the rebuilds.
> 
>     Regards,
>     Adam
> 
> 
>     -- 
>     Adam Goryachev
>     Website Managers
>     www.websitemanagers.com.au <http://www.websitemanagers.com.au>
> 

-- 
Adam Goryachev
Website Managers
www.websitemanagers.com.au