[Drbd-dev] Re: drbd question about secondary vs primary; drbddisk status problem
Philipp Reisner
philipp.reisner@linbit.com
Fri, 20 Aug 2004 14:52:52 +0200
On Thursday 19 August 2004 14:14, Lars Ellenberg wrote:
[...]
> > Split-brain scenarios that end in Primary/Primary (both StandAlone)
> > are already covered in the new design (I am writing it up now). What else?
>
> not all that unlikely:
>
> if the primary dies (or gets killed), but before dying somehow still
> managed to lose its drbd connection _and_ therefore incremented the
> "ConnectedCount"...
>
> the "slave" now goes Secondary->Primary, but because it is < Connected
> it increments the ArbitraryCount...
>
> situation at the next connect:
>
> Flags: consistent, ,been primary last time
>
> former Primary  1:X:Y:a+1:b  :10 (now Secondary after reboot)
> current Primary 1:X:Y:a  :b+1:10
>
> doh. the current Primary is supposed to become SyncTarget... shitty.
> --> current Primary goes StandAlone.
>
> next connection attempt (initiated by the operator)
> ... -> "split brain detected"
> --> both go StandAlone
>
> we may need to introduce an additional counter, a "CRM count", and
> the CRM, once it has shot the other node, would to be on the safe side
> have to do a drbdsetup "--crm" (cf. --human) primary; that would at
> least resolve the scenario described above...
>
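The quoted scenario can be replayed with a small sketch. It assumes the colon-separated counters are compared field by field from left to right, with the ConnectedCount ahead of the ArbitraryCount, which is what the quoted outcome implies. The `GenCounts` type, its field names, the `sync_source` helper, and the example values are hypothetical and are not DRBD code; the trailing flags field is left out.

```python
from typing import NamedTuple


class GenCounts(NamedTuple):
    # Hypothetical names for the fields of "1:X:Y:a:b" in the quoted mail.
    consistent: int
    x: int
    y: int
    connected: int   # "ConnectedCount"
    arbitrary: int   # "ArbitraryCount"


def sync_source(n1: GenCounts, n2: GenCounts) -> str:
    """Return which node looks 'newer' under left-to-right comparison."""
    if n1 == n2:
        return "equal"
    # NamedTuples compare like plain tuples: first differing field decides.
    return "n1" if n1 > n2 else "n2"


X, Y, a, b = 3, 2, 7, 4  # arbitrary example values
former_primary = GenCounts(1, X, Y, a + 1, b)   # lost its connection before dying
current_primary = GenCounts(1, X, Y, a, b + 1)  # promoted while < Connected

# The former Primary wins on ConnectedCount, so the current Primary
# would have to become SyncTarget, as described in the quoted mail.
print(sync_source(former_primary, current_primary))
```

Under these assumptions the comparison never reaches the ArbitraryCount, which is why the node that is Primary now ends up on the losing side.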
Hi,
Right, old topic: what should we do after a split-brain situation?
I have looked up my papers from 2001 to understand why it is done
the way it is today.
The situation:
 N1      N2
 P  ---  S    Everything ok.
 P  - -  S    Link breaks.
 P  - -  P    A (also split-brained) Cluster-mgr makes N2 primary too.
 X       X    Both nodes down.
 P  ---  S    The current behaviour.
What should be done after split brain?
The current policy is that the node that was Primary before the
split-brain situation should be Primary afterwards.
This policy is hard-coded into DRBD. It is an arbitrary decision;
I thought it was a good idea.
The questions are:
Should this policy be configurable? (IMO: yes)
Which policies do we want to offer?
* The node that was primary before split brain (current behaviour)
* The node that became primary during split brain
* The node that modified more of its data during the split-brain
  situation [ Do not think about implementation yet, just about
  the policy ]
* others?...
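To make the candidate policies concrete, here is a minimal sketch of a configurable choice between them. Everything in it is an assumption for illustration: the `Policy` names, the `NodeHistory` record, and `choose_primary` do not exist in DRBD, and the sketch ignores how the needed information would actually be obtained.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Policy(Enum):
    PRIMARY_BEFORE = "primary-before-split-brain"      # current behaviour
    PRIMARY_DURING = "became-primary-during-split-brain"
    MORE_CHANGES = "modified-more-data"


@dataclass
class NodeHistory:
    name: str
    primary_before: bool    # was Primary before the link broke
    promoted_during: bool   # became Primary during the split brain
    blocks_changed: int     # data modified during the split brain


def choose_primary(policy: Policy, n1: NodeHistory, n2: NodeHistory) -> Optional[str]:
    """Return the name of the node the policy prefers, or None on a tie."""
    if policy is Policy.PRIMARY_BEFORE:
        k1, k2 = n1.primary_before, n2.primary_before
    elif policy is Policy.PRIMARY_DURING:
        k1, k2 = n1.promoted_during, n2.promoted_during
    else:
        k1, k2 = n1.blocks_changed, n2.blocks_changed
    if k1 == k2:
        return None  # the policy cannot decide; someone else has to
    return n1.name if k1 > k2 else n2.name


# The situation from the diagram: N1 was Primary, N2 got promoted later.
n1 = NodeHistory("N1", primary_before=True, promoted_during=False, blocks_changed=120)
n2 = NodeHistory("N2", primary_before=False, promoted_during=True, blocks_changed=4500)

for policy in Policy:
    print(policy.value, "->", choose_primary(policy, n1, n2))
```

Returning `None` on a tie keeps the sketch honest: a policy that cannot decide should not silently fall back to one node.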
The second question to answer is:
What should we do if the connecting network heals? I.e.
 N1      N2
 P  ---  S    Everything ok.
 P  - -  S    Link breaks.
 P  - -  P    A (also split-brained) Cluster-mgr makes N2 primary too.
 ?  ---  ?    What now ?
Current policy: The two nodes will refuse to connect. The administrator
has to resolve this.
Are there any other policies that would make sense?
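As a reference point for alternatives, the current behaviour on reconnect can be written down as a tiny decision function. This is only a sketch of the policy as described here; the `Role` enum and `on_reconnect` are hypothetical names, and the returned strings merely echo the connection states mentioned in this thread.

```python
from enum import Enum


class Role(Enum):
    PRIMARY = "Primary"
    SECONDARY = "Secondary"


def on_reconnect(role_n1: Role, role_n2: Role) -> str:
    """Current policy: two Primaries refuse to connect when the link heals."""
    if role_n1 is Role.PRIMARY and role_n2 is Role.PRIMARY:
        # Both sides may have modified data; the administrator has to resolve it.
        return "StandAlone"
    return "Connected"


print(on_reconnect(Role.PRIMARY, Role.PRIMARY))
print(on_reconnect(Role.PRIMARY, Role.SECONDARY))
```

Any other policy would replace the `StandAlone` branch with a rule for picking the node whose data survives.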
-Philipp
--
: Dipl-Ing Philipp Reisner Tel +43-1-8178292-50 :
: LINBIT Information Technologies GmbH Fax +43-1-8178292-82 :
: Schönbrunnerstr 244, 1120 Vienna, Austria http://www.linbit.com :