[DRBD-user] fail-over using duplicate ROOT file system

Ian! D. Allen idallen at idallen.ca
Sun Jul 17 21:09:16 CEST 2011


Is this dual-ROOT scheme below workable?

Let me explain using Machine A and Machine B:

Machine A
 - has a stock, ext4, bootable ROOT file system, non-DRBD
 - has a second ROOT partition that is an RSYNC copy of Machine B
 - has an ext4 data-only partition, DRBD primary, that uses
   Machine B as DRBD secondary

Machine B
 - has a stock, ext4, bootable ROOT file system, non-DRBD
 - has a second ROOT partition that is an RSYNC copy of Machine A
 - has a (hidden) data-only partition that is DRBD secondary, with
   Machine A as DRBD primary.  (Hidden, because you can only mount ext4
   on one machine, not two.)

Machine A does an RSYNC of its ROOT partition to the corresponding spare
ROOT partition on Machine B regularly, and vice-versa.
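
A minimal sketch of the regular ROOT copy described above, written as a
Python helper that only builds the rsync command line.  The peer host name
"machine-b" and the spare-ROOT mount point "/mnt/spare-root" are
hypothetical placeholders, not taken from the setup above; the rsync
options shown are standard ones, but whether they suit a given system is
an assumption.

```python
import shlex


def build_root_rsync(peer_host, spare_root_mount, dry_run=True):
    """Build the rsync argument list that copies this machine's ROOT
    file system onto the spare ROOT partition mounted on the peer.

    peer_host and spare_root_mount are hypothetical names chosen for
    illustration only.
    """
    cmd = [
        "rsync",
        "-aHAX",              # archive mode, plus hard links, ACLs, xattrs
        "--numeric-ids",      # copy uid/gid numbers, not names
        "--one-file-system",  # stay on ROOT: skips /proc, /sys, other mounts
        "--delete",           # remove files on the copy that are gone here
    ]
    if dry_run:
        cmd.append("--dry-run")  # report what would change, copy nothing
    cmd += ["/", f"root@{peer_host}:{spare_root_mount}/"]
    return cmd


if __name__ == "__main__":
    # Print the command instead of running it, so it can be reviewed first.
    print(shlex.join(build_root_rsync("machine-b", "/mnt/spare-root")))
```

Because of --one-file-system, the DRBD data partition is not copied as
long as it is mounted separately from ROOT; DRBD itself replicates it.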

When Machine A fails, reboot the Machine B hardware (DRBD secondary) using
the RSYNC copy of Machine A's ROOT partition.  This brings the machine
up solo as if it were Machine A, accessing the local DRBD partition as
if it were Machine A (DRBD primary).  (Perhaps some fsck will be needed
to make the ext4 on the DRBD usable?)  Is this doable?  (I know it works,
since I've tried it, but I'm wondering if I'm missing anything.)
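
The fail-over steps above could be sketched as the following command
sequence, generated by a small Python helper that prints the commands
rather than running them.  The resource name "r0", the device
"/dev/drbd0" and the mount point "/data" are hypothetical; the sketch
also assumes the local DRBD disk was up to date when Machine A failed.

```python
import shlex


def failover_commands(resource="r0", device="/dev/drbd0", mountpoint="/data"):
    """Return the commands to take over the DRBD data partition.

    All three default names are placeholders for illustration; use the
    names from the real DRBD configuration.
    """
    return [
        # Promote the local (formerly secondary) side to primary.
        ["drbdadm", "primary", resource],
        # Preen the ext4 file system; it was not cleanly unmounted.
        ["e2fsck", "-p", device],
        # Mount the data partition where Machine A expects it.
        ["mount", "-t", "ext4", device, mountpoint],
    ]


if __name__ == "__main__":
    for cmd in failover_commands():
        print(shlex.join(cmd))
```

Since ext4 is a journaling file system, the journal is normally replayed
during the check or at mount time, so a full fsck may not be required.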

Work continues using Machine B's hardware, booted to be Machine A
(DRBD primary).  This is now the new Machine A (DRBD primary), running
without any Machine B (DRBD secondary).

Repair the failed old Machine A.  Boot the old Machine A hardware using
the copy of Machine B's ROOT partition.  This brings the machine up
as if it were Machine B, partnering with the new Machine A to provide
secondary DRBD for the data-only partition.  This is now the new Machine B
(DRBD secondary).

Is it okay to bring up Machine B's secondary DRBD as primary, simply by
using the copy of the ROOT from Machine A?  Is it okay to make Machine
A behave as Machine B, simply by using a copy of Machine B's ROOT?

One "gotcha" I've found is that booting Machine A's ROOT on Machine
B's hardware leads to mis-numbering of the Machine B network
cards as eth2 and eth3 due to Machine A entries already existing in
/etc/udev/rules.d/70-persistent-net.rules.  I'll have to tweak those at
boot time, or perhaps I can create a merged file that works correctly
on both machines.
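
One way to build such a merged file is sketched below in Python.  It
assumes the usual single-line rule format of 70-persistent-net.rules,
reduced here to a MAC-address match; the MAC addresses are made-up
placeholders and the function name is hypothetical.  Each interface name
appears once per machine, and only the rule whose MAC address is present
in the running hardware can match.

```python
# Simplified rule template; real generated rules usually carry extra
# match keys (for example ATTR{type}), omitted here as an assumption.
RULE = ('SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", '
        'ATTR{{address}}=="{mac}", KERNEL=="eth*", NAME="{name}"')


def merged_rules(machines):
    """Return rules text naming the NICs of every machine consistently.

    machines maps a label to a dict of interface name -> MAC address.
    """
    lines = []
    for label, nics in machines.items():
        lines.append(f"# {label}")
        for name, mac in sorted(nics.items()):
            # sysfs reports MAC addresses in lower case, so match that.
            lines.append(RULE.format(mac=mac.lower(), name=name))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder MAC addresses, not real hardware.
    print(merged_rules({
        "Machine A hardware": {"eth0": "00:00:5E:00:53:0A",
                               "eth1": "00:00:5E:00:53:1A"},
        "Machine B hardware": {"eth0": "00:00:5E:00:53:0B",
                               "eth1": "00:00:5E:00:53:1B"},
    }), end="")
```

The same merged file would then be installed in both ROOT file systems,
so either ROOT names the cards eth0 and eth1 on either machine.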

-- 
| Ian! D. Allen  -  idallen at idallen.ca  -  Ottawa, Ontario, Canada
| Home Page: http://idallen.com/   Contact Improv: http://contactimprov.ca/
| College professor (Free/Libre GNU+Linux) at: http://teaching.idallen.com/
| Defend digital freedom:  http://eff.org/  and have fun:  http://fools.ca/
