[DRBD-user] Recovery from split-brain condition, please advise.
Ivan
ivan.teliatnikov at gmail.com
Mon Nov 16 15:06:31 CET 2009
Hello everyone!
I am new to DRBD and to this list. I recently took over an HA + DRBD
2-node cluster that suffered a split-brain condition over 6 months ago.
During this time the healthy node has continued to work as a file server,
whilst the second node has had both HA and DRBD turned off.
Primary node: flintstone ( working in production )
Secondary node: rubble ( has been off-line for 6 months )
------------- state, dstate, cstate of primary node --------------------
[root at flintstone ~]# drbdadm state all
Primary/Unknown
[root at flintstone ~]# drbdadm dstate all
UpToDate/DUnknown
[root at flintstone ~]# drbdadm cstate all
StandAlone
------------- state, dstate, cstate of secondary ( not working ) node --------------------
[root at rubble init.d]# drbdadm state all
Secondary/Unknown
[root at rubble init.d]# drbdadm dstate all
UpToDate/DUnknown
[root at rubble ~]# drbdadm cstate all
WFConnection
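For reference when reading the cstate values above, here is a small sketch of how I understand the connection states involved. The function name describe_cstate is something I made up for illustration; it is not part of DRBD, and the descriptions are my reading of the states, not official wording.

```shell
# Hypothetical helper, not part of DRBD: turn a connection state string
# (as printed by "drbdadm cstate") into a short description.
describe_cstate() {
    case "$1" in
        StandAlone)   echo "not connecting; needs an explicit drbdadm connect" ;;
        WFConnection) echo "waiting for the peer to become reachable" ;;
        Connected)    echo "connected to the peer" ;;
        *)            echo "other state: $1" ;;
    esac
}
```

It could be called as describe_cstate "$(drbdadm cstate resource)" on each node, assuming a single resource so that only one line is printed.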
As far as I understand, the recovery steps below should guarantee
recovery from the split-brain condition.
1. # unmount the block devices
2. # disconnect all resources on both nodes
$ drbdadm disconnect all
3. # force both nodes to be secondary
$ drbdadm secondary all
4. # select the slave node and tell it to drop all its data
$ drbdadm -- --discard-my-data connect resource
This forces all resources on the secondary ( bad ) node to be secondary
and to drop all data.
5. # select the source ( master ) node and promote it to primary
$ drbdadm -- --overwrite-data-of-peer primary resource
6. # start synchronisation on the source ( master ) node
$ drbdadm connect resource
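To make the procedure easier to review, here is a dry-run sketch that only prints the commands from the steps above, grouped by node. Nothing is executed. The function name plan_recovery, the role names victim and survivor, and the resource name r0 in the usage note are placeholders of mine. The sketch simply echoes the commands as I listed them, so it does not confirm that they are the right ones.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands from the list above, runs nothing.
# plan_recovery and the role names are hypothetical, not DRBD terminology.
plan_recovery() {
    role="$1"             # "victim" (data is discarded) or "survivor"
    res="${2:-resource}"  # DRBD resource name; placeholder by default

    # Steps 1-3: the same on both nodes.
    echo "# unmount the file systems on the DRBD devices first"
    echo "drbdadm disconnect $res"
    echo "drbdadm secondary $res"

    case "$role" in
        victim)
            # Step 4: reconnect and throw away local changes.
            echo "drbdadm -- --discard-my-data connect $res"
            ;;
        survivor)
            # Steps 5-6: promote, then reconnect so the resync can start.
            echo "drbdadm -- --overwrite-data-of-peer primary $res"
            echo "drbdadm connect $res"
            ;;
        *)
            echo "unknown role: $role" >&2
            return 1
            ;;
    esac
}
```

Running plan_recovery victim r0 on rubble and plan_recovery survivor r0 on flintstone would print the command list for each node, which can then be checked before anything is typed for real.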
I would greatly appreciate it if you could answer my questions.
1. Any comments on the procedure?
2. How do I know if the --discard-my-data option is necessary?
3. I wonder if "--" is required after drbdadm? It is mentioned in the
on-line version of the DRBD User's Guide, whilst the man page for drbdadm
does not mention it.
4. After DRBD starts the synchronisation process, can I mount the block
devices on the master node, or do I have to wait until synchronisation
is completed?
Thank you very much for your help.
Ivan