[DRBD-user] Recovery from split-brain condition, please advice.
adam at linbit.com
Mon Nov 16 20:41:44 CET 2009
> 1. # umount block devices
Only needed on the split-brain victim (the secondary in this case), and only
if your CRM's brain also split and you found DRBD promoted and the filesystem
mounted there.
> 2. # disconnect all resources on both nodes
> $ drbdadm disconnect all
Not needed on the primary, since it is already disconnected (StandAlone).
> 3. # force both nodes to be secondary
> $ drbdadm secondary all
Again, only needed on the victim if you found it promoted.
> 4. # select slave drive and tell it to drop all data
> $ drbdadm -- --discard-my-data connect resource
> to force all resources on the secondary node ( bad ) to be secondary
> and to drop all data.
It is already secondary; it will reconnect to its peer and attempt to
sync whatever data is needed to get back to UpToDate.
> 5. # select source and master mode and start synchronisation.
> $ drbdadm -- --overwrite-data-of-peer primary resource
This will initiate a FULL resync. Not needed, just reconnect and begin
> 6. # Start synchronisation on the source ( master ) node
> drbdadm connect resource
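
Putting the comments above together, the trimmed procedure could be sketched
roughly as below. This is only an illustration, not a tested recovery script:
the `run` wrapper, the `DRY_RUN` variable and the function names are made up
for this example, and the resource name is whatever you pass in. With
DRY_RUN=1 the commands are printed instead of executed, so you can review
them first.

```shell
#!/bin/sh
# Sketch only: helper names and DRY_RUN are assumptions for illustration.

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Run on the split-brain victim, the node whose changes get thrown away.
recover_victim() {
    res="$1"
    # Only needed if the victim is not already StandAlone.
    run drbdadm disconnect "$res"
    # Only needed if the victim was found promoted (unmount first).
    run drbdadm secondary "$res"
    # Reconnect and discard the victim's changes made during the split.
    run drbdadm -- --discard-my-data connect "$res"
}

# Run on the surviving primary, and only if it is StandAlone.
recover_survivor() {
    res="$1"
    run drbdadm connect "$res"
}
```

For example, `DRY_RUN=1 recover_victim r0` (with a hypothetical resource
named r0) just lists the three commands. Note that nothing here touches
--overwrite-data-of-peer, and the primary is never unmounted or demoted.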
> I would greatly appreciate if you can answer my questions.
> 1. Any comments on the procedure?
> 2. How do I know if the --discard-my-data option is necessary?
One node has outdated data. Passing the option on that node designates it as
the victim.
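
Before picking the victim it helps to look at what each node reports. A small
sketch, assuming the DRBD 8.x `drbdadm cstate`, `drbdadm role` and
`drbdadm dstate` subcommands; the function names are made up for this
example and the resource name is passed in by you.

```shell
#!/bin/sh
# Sketch only: show_state and needs_reconnect are hypothetical helper names.

# Print connection state, role and disk state for one resource.
show_state() {
    res="$1"
    echo "connection: $(drbdadm cstate "$res")"
    echo "role: $(drbdadm role "$res")"
    echo "disk: $(drbdadm dstate "$res")"
}

# Succeeds if the given connection state is StandAlone, i.e. the node has
# stopped trying to reach its peer and needs an explicit "drbdadm connect".
needs_reconnect() {
    [ "$1" = "StandAlone" ]
}
```

Run `show_state` on both nodes and compare. Something like
`needs_reconnect "$(drbdadm cstate r0)" && echo "r0 is StandAlone"` (r0 being
a hypothetical resource name) tells you whether a node still has to be
reconnected by hand.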
> 3. After DRBD starts process of synchronisation, can I mount block
> devices on the master node, or do I have to wait until synchronisation
> is completed?
You shouldn't need to unmount, demote or otherwise stop services on the
primary during any of this.
Also, look into notify-split-brain.sh (a split-brain notification handler)
and crm-fence-peer.sh or dopd (fencing or outdating the peer).
Adam Gandelman - 503-573-1262 x203
LINBIT - Your Way to High Availability
8152 SW Hall Blvd., Suite #209 : Beaverton, OR 97008