[DRBD-user] Recovery from split-brain condition, please advise.

Adam Gandelman adam at linbit.com
Mon Nov 16 20:41:44 CET 2009


Ivan wrote:
> 1. # umount block devices
Only needed on the split-brain victim (the secondary in this case), and
only if your CRM's brain also split and you found DRBD promoted and the
filesystem mounted there.
> 2. # disconnect all resources on both nodes
> $ drbdadm disconnect all
>   
Not needed on the primary, since it is already disconnected (StandAlone).
> 3. # force both nodes to be secondary
> $ drbdadm secondary all
>   
Again, only needed on the victim if you found it promoted.
> 4. # select slave drive and tell it to drop all data
> $ drbdadm -- --discard-my-data connect resource
> to force all resources on the secondary node ( bad ) to be secondary
> and to drop all data.
>   
It is already secondary; it will reconnect to its peer and attempt to
sync whatever data is needed to get back to UpToDate.
> 5. # select source and master mode and start synchronisation.
> $ drbdadm -- --overwrite-data-of-peer primary resource
>   
This will initiate a FULL resync.  Not needed; just reconnect and let the
resync begin.
> 6. # Start synchronisation on the source ( master ) node
> drbdadm connect resource
>
>
> I would greatly appreciate if you can answer my questions.
>
> 1. Any comments on the procedure?
>
> 2. How do I know if --discard-my-data option is necessary ?
>
>   
One node has outdated data.  Passing --discard-my-data on that node
designates it as the victim.
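
Before picking the victim it can help to compare the connection state,
role and disk state on both nodes. A small sketch, not a definitive
procedure: the helper name show_drbd_state and the resource name r0 are
placeholders made up for illustration; cstate, role and dstate are the
usual drbdadm status subcommands.

```shell
#!/bin/sh
# Print connection state, role and disk state for one resource.
# Run this on both nodes and compare before deciding whose data to discard.
# "show_drbd_state" is a hypothetical helper name, not part of DRBD.
show_drbd_state() {
    res="$1"
    echo "cstate: $(drbdadm cstate "$res")"   # connection state, e.g. StandAlone
    echo "role:   $(drbdadm role "$res")"     # local/peer role
    echo "dstate: $(drbdadm dstate "$res")"   # local/peer disk state
}
```

For example, `show_drbd_state r0` on each node; a node reporting
StandAlone has dropped its connection and needs an explicit connect.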
> 3. After DRBD starts process of synchronisation, can I mount block
> devices on the master node, or do I have to wait until synchronisation
> is completed?
>
>   
You shouldn't need to unmount, demote or otherwise stop services on the
primary during any of this.
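
Putting the comments above together, the manual recovery could be
sketched roughly as follows. This is only a sketch under assumptions:
the resource name r0, the DRBDADM override variable and the two function
names are made up for illustration; the drbdadm invocations themselves
are the ones quoted in this thread.

```shell
#!/bin/sh
# Sketch of manual split-brain recovery, following the thread above.
# Assumption: DRBDADM may be overridden (e.g. DRBDADM=echo) to print
# the commands instead of running them. Function names are hypothetical.
DRBDADM="${DRBDADM:-drbdadm}"

# Run on the split-brain victim (the node whose changes are thrown away).
recover_victim() {
    res="$1"
    # Only matters if the victim was found promoted; unmount its
    # filesystem by hand first in that case.
    $DRBDADM secondary "$res" || return 1
    # Reconnect, discarding the local changes made during the split-brain.
    $DRBDADM -- --discard-my-data connect "$res"
}

# Run on the survivor, and only if it dropped to StandAlone.
recover_survivor() {
    res="$1"
    $DRBDADM connect "$res"
}
```

Usage would be `recover_victim r0` on the victim, then
`recover_survivor r0` on the other node. The survivor keeps its
filesystem mounted and stays primary throughout.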

Also, look into notify-split-brain.sh for split-brain notification, and
crm-fence-peer.sh or dopd for fencing/outdating the peer.
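
As a rough illustration of how those handlers are typically wired into
drbd.conf, here is a configuration fragment. Treat it as a sketch: the
resource name r0, the /usr/lib/drbd path and the "root" mail recipient
are assumptions that depend on your installation, and the crm-* scripts
only apply to Pacemaker clusters.

```
resource r0 {
  disk {
    # ask the fence-peer handler to act when the peer becomes unreachable
    fencing resource-only;
  }
  handlers {
    # mail a notification when a split brain is detected
    split-brain "/usr/lib/drbd/notify-split-brain.sh root";
    # place / remove a constraint in the CRM so the outdated peer is not promoted
    fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
    after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
  }
}
```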

-- 
Adam Gandelman - 503-573-1262 x203
LINBIT - Your Way to High Availability
8152 SW Hall Blvd., Suite #209 : Beaverton, OR 97008

http://www.linbit.com


