[DRBD-user] DRBD - one half of Proxmox cluster miscommunicating

Dirk Bonenkamp - ProActive dirk at proactive.nl
Tue Jul 31 10:57:47 CEST 2012


On 31-7-2012 10:51, JAMES GIBBON wrote:
> On Tue, 31 Jul 2012 09:32:48 +0200
> Felix Frank <ff at mpexnet.de> wrote:
>
> >
> > Judging from your log excerpt, there might be a connectivity
> > issue, but this could very well be a pure split brain that
> > needs resolving. See
> > http://www.drbd.org/users-guide/s-resolve-split-brain.html and
> > note that you will likely lose whatever has been written to
> > your "troubled" node. You may want to copy precious data if any
> > has been written.
> >
> > What we'd need to see is your drbd configuration. Also the
> > connection states of both nodes' respective NICs. Finally: Have
> > you tried just issuing "drbdadm connect all" on the second node?
> >
>
> Hi Felix,
>
> Many thanks for your reply.
>
> At the moment, no virtual machines are running on the second,
> "troubled" server. I cannot afford to lose data from the master's
> view of the storage, which is exactly as I would wish it, but
> the slave isn't doing anything - so if they are out of sync, I'm
> happy for the slave to adopt the same view of the world as the
> master.
>
> Aaaaaahhh.. I can see what I've done. Thanks for asking about the
> NICs. The IP address connecting to the storage on the master is
> 10.0.1.1. The IP on the slave should be 10.0.1.2, but in fact at
> the moment it is also 10.0.1.1. This is because when I changed
> the server public IP addresses I copied over the interfaces file
> from the master to the slave. I edited the public IP address, but
> not the private IP address used to communicate with the NAS units.
>
> So it looks like it's my fault.
>
> OK. So my question now is - how can I fix this, without losing
> data from the master? Tempting though it is to simply correct the
> file and reboot, I think it better to solicit a more experienced
> opinion first.
>
> Another consideration is that if resyncing the disks will consume
> a lot of resources on the master, I'll need that to happen out of
> hours to avoid impacting production systems running on it.
>
> Thanks again, and I'd be very grateful for some help. My main
> concern is that I can't afford for the master's version to be disrupted.
>

Since the problem is on the slave, I wouldn't worry too much. Fix the IP
on the slave and issue 'drbdadm connect all' there. If it has
been the slave all the time, it should connect and synchronise without
complaining.
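
As a rough sketch only, the sequence on the slave could look like the
following. It assumes the storage address has already been corrected to
10.0.1.2 in the interfaces file and the interface restarted, and that
this DRBD version exposes its state in /proc/drbd. The run and
reconnect_slave helpers and the DRY_RUN variable are made up for this
example; by default nothing is executed, the commands are only printed.

```shell
#!/bin/sh
# Hypothetical helper: print the command unless DRY_RUN is set to 0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Hypothetical wrapper around the steps described above; run on the slave.
reconnect_slave() {
    # Confirm the slave now carries its own storage address (10.0.1.2 here).
    run ip -4 addr show
    # Look at the DRBD state before changing anything.
    run cat /proc/drbd
    # Ask DRBD to reconnect all resources to the peer.
    run drbdadm connect all
    # Check the state again; a running resync should be visible here.
    run cat /proc/drbd
}
```

Calling reconnect_slave as-is just lists the four commands. To execute
them, run it as root with DRY_RUN set to 0. If DRBD reports a split
brain instead of connecting, stop and follow the split-brain recovery
page Felix linked rather than retrying.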

And... just to be sure: Make a backup of the data on the master node...

Cheers,

Dirk




