> I'll quote the code, to clarify the intention for when
> "degr-wfc-timeout" is used:
>  /* If I am currently not Primary,
>   * but meta data primary indicator is set,
>   * I just now recover from a hard crash,
>   * and have been Primary before that crash.
>   *
>   * Now, if I had no connection before that crash
>   * (have been degraded Primary), chances are that
>   * I won't find my peer now either.
>   *
>   * In that case, and _only_ in that case,
>   * we use the degr-wfc-timeout instead of the default,
>   * so we can automatically recover from a crash of a
>   * degraded but active "cluster" after a certain timeout.
>   */
> ---------------------------------------
> </snip>
> Hmmm... why not always start both nodes as Secondary with no timeout,
> then let heartbeat force the right one to be Primary? Wouldn't that
> avoid blocking and split-brain?

That's why it is configurable...
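
For reference, the rule in the quoted comment boils down to something
like the following. This is only a sketch: the struct and function names
are made up for illustration and are not taken from the drbd source, and
the timeout values are whatever you configured.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names throughout; this is NOT the actual drbd code. */
struct node_state {
    bool is_primary_now;   /* current role is Primary */
    bool md_primary_flag;  /* meta data primary indicator is set */
    bool was_connected;    /* had a peer connection before the crash */
};

struct timeouts {
    int wfc_timeout;       /* the default wait-for-connection timeout */
    int degr_wfc_timeout;  /* only for a crashed, degraded Primary */
};

/* Pick the timeout to use while waiting for the peer at startup. */
int pick_wfc_timeout(const struct node_state *s, const struct timeouts *t)
{
    assert(s != NULL && t != NULL);

    /* Not Primary now, but the meta data says we were: hard crash. */
    bool crashed_as_primary = !s->is_primary_now && s->md_primary_flag;

    /* Only a Primary that was already degraded gets the short timeout. */
    if (crashed_as_primary && !s->was_connected)
        return t->degr_wfc_timeout;

    return t->wfc_timeout;
}
```

So a node that crashed as a connected Primary, or that was Secondary,
still gets the default timeout; only the degraded-Primary case differs.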

Just make sure that your heartbeat won't decide to make a node primary
that happens to have long-since outdated data. Consider:
  cluster fine
  secondary crash [first spike of a brownout]
  time passes
  primary crash   [well, now it's a real blackout]

  ... [power back]

  previously secondary comes up, heartbeat decides to make it primary
     *** you are primary with outdated data ***
  previously primary needs a lot longer (recounts its SCSI devices,
  thinks it needs to fsck its root, whatever)...
  same effect as split brain: diverging data sets.
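
To make the timeline concrete, here is a toy replay of it in C. It is a
sketch under simplifying assumptions: "data" is just a write counter,
replication is instant while the peer is up, and all names are invented
for the example; none of this is drbd or heartbeat code.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of one node; hypothetical, for illustration only. */
struct node {
    bool up;          /* node is running */
    int writes_seen;  /* how many writes have reached its disk */
};

/* The primary takes one write; the peer gets it only if it is up. */
void primary_write(struct node *primary, struct node *peer)
{
    primary->writes_seen++;
    if (peer->up)
        peer->writes_seen = primary->writes_seen;
}

/*
 * Replay the brownout scenario and return how many writes the old
 * secondary is missing at the moment heartbeat would promote it.
 */
int replay_brownout(int writes_while_degraded)
{
    assert(writes_while_degraded >= 0);

    struct node a = { true, 0 };  /* primary */
    struct node b = { true, 0 };  /* secondary */
    int i;

    primary_write(&a, &b);        /* cluster fine: both in sync */
    b.up = false;                 /* secondary crash */
    for (i = 0; i < writes_while_degraded; i++)
        primary_write(&a, &b);    /* time passes, A runs degraded */
    a.up = false;                 /* primary crash */
    b.up = true;                  /* power back, B boots first */

    return a.writes_seen - b.writes_seen;
}
```

Any non-zero result means B would go primary on outdated data, and every
write it then accepts makes the two data sets diverge further.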

> Is it possible to set up something like this...
> 	Cluster fine, connected.
> 	Pull plug from primary machine A.
> 	Heartbeat on B takes over, forces drbd on B to be primary.
> 	Plug machine A back in.
> 	Heartbeat on A detects it is secondary, forces drbd on A to go
> secondary.
> If this is a stupid question, forgive me. I am shameless about showing
> my ignorance. :-)

Well, it depends on which plug you pull,
and on what you can teach heartbeat to detect.

