[DRBD-user] active-active over long distance (split-brains)

José Román Bilbao jrbcast at gmail.com
Thu Feb 9 19:39:02 CET 2012

Hi all,

We have this scenario:

- Datacenter A
  - 1 server running KVM VMs:
    * 1 Openfiler VM that distributes disk to the other VMs (it stores the
      VMs and their data) and serves other uses
    * Other VMs...
  - DRBD configured to replicate Openfiler's volumes, as primary

- Datacenter B
  - 1 server running KVM as a backup in case datacenter A fails
  - DRBD configured as a backup of Openfiler so the VMs can be restarted
    with up-to-date data after a failure. It also runs as primary.

Both centers are connected through wireless links (500 Mb/s), which is
enough bandwidth for our requirements. Nevertheless, the ping time is 300 ms
because of the multiple routers in between...

We have been experiencing multiple split-brain situations and we don't know
why. Perhaps the link goes down for a while, but I don't understand the
source of the problem: although both nodes are primary, the server in
datacenter B should never write to disk, since all VMs are kept in
datacenter A. Is this assumption right? Is this split-brain just
"conceptual", meaning that the network was lost but no real inconsistencies
have appeared? Under that assumption, would it be OK to apply autorecover?
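
To make the autorecover question concrete, this is roughly the policy I have
in mind. It is only a sketch: the resource name r0 is a placeholder, the rest
of the resource definition is omitted, and I am assuming the standard
after-sb-* options of the net section are the right knobs for this case.

```
resource r0 {
  net {
    # both nodes may be primary at the same time (our current setup)
    allow-two-primaries;

    # neither node primary when split-brain is detected: if one side
    # wrote nothing, take the other side's changes; otherwise disconnect
    after-sb-0pri discard-zero-changes;

    # exactly one node primary: drop the changes of the secondary
    after-sb-1pri discard-secondary;

    # both nodes primary: do not resolve automatically, just disconnect
    after-sb-2pri disconnect;
  }
}
```

If datacenter B really never writes, discard-zero-changes should resolve the
split-brain without losing anything; if both sides did write, the node stays
disconnected and we would still have to recover manually.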

Can anyone shed some light on this? Any alternatives? Any experience under such

Thanks in advance,
