[DRBD-user] split brain issues

Cesar Peschiera brain at click.com.py
Fri Jan 23 02:30:43 CET 2015

Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.


Hi Richard

I have been using PVE + DRBD (primary-primary, protocol C) + HA (rgmanager)
without problems for many years now (with and without BBU on the RAID cards,
always with the tuning recommended on the DRBD web portal).
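
For dual-primary, the part of that tuning that matters most for split brain
is the "net" section of the resource. A rough example only (the resource
name and the policies below are illustrative, adapt them to your case):

resource r0 {
  net {
    allow-two-primaries;
    after-sb-0pri discard-zero-changes;   # no primary involved: keep the node with changes
    after-sb-1pri discard-secondary;      # one primary: the secondary gives up its data
    after-sb-2pri disconnect;             # two primaries: no automatic resolution
  }
}

Note that with two primaries (after-sb-2pri disconnect) a real split brain
still has to be resolved by hand, so a reliable dedicated replication link
is very important.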

Now I am using:
- 2 NICs dedicated to DRBD in bonding-rr (NIC-to-NIC, some pairs of nodes
with 10 Gb/s, others with 1 Gb/s, always with MTU 9000)
- Programs:
drbd-utils 8.9.(1+linbit-1)
drbd8-module-source (2:8.4.4-0)
- Kernel: 3.10
pve-kernel-3.10.0-5-pve (3.10.0-19)
- LVM on top of DRBD (a rough example resource is shown below)
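
Roughly, such a dual-primary resource over the dedicated bond looks like
this (hostnames, backing disks and addresses are only placeholders for the
example); the LVM volume group used by the VMs is then created on top of
/dev/drbd0:

resource r0 {
  startup {
    become-primary-on both;
  }
  net {
    protocol C;
    allow-two-primaries;
  }
  on nodeA {
    device    /dev/drbd0;
    disk      /dev/sdb1;          # backing device (placeholder)
    address   10.1.1.1:7788;      # IP of the dedicated DRBD bond
    meta-disk internal;
  }
  on nodeB {
    device    /dev/drbd0;
    disk      /dev/sdb1;
    address   10.1.1.2:7788;
    meta-disk internal;
  }
}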

Hardware:
For better performance, it is best to enable in the BIOS:
- the I/OAT DMA engine (on all servers with Intel processors, old or new,
it always works well with the NICs)
- "Best performance" mode (not energy saving)
PVE (old and new versions) always works better with this hardware
configuration.

----- Original Message ----- 
From: "Lechner Richard" <r.lechner at gmx.net>
To: <drbd-user at lists.linbit.com>
Sent: Thursday, January 22, 2015 3:02 PM
Subject: Re: [DRBD-user] split brain issues


> Am Donnerstag 22 Januar 2015, 18:51:52 schrieb Yannis Milios:
>
> Hello Yannis,
>
>> Not sure what caused Inconsistent state on your drbd resource on Node1, 
>> but
>> my guess is that it experienced some kind of low-level corruption on its
>> backing storage (hard disks) and auto sync was initiated from Node2.
>
> Hard disks and md devices are all still running without any errors.
> At the moment I run it as Primary/Secondary and the status is
> UpToDate/UpToDate.
> So it looks like Eric's point is right: pri/pri causes some issues.
> But it was OK before, so I think it must be one of the latest kernel
> updates.
> Something changed in the DRBD code.
>
>> Are you using Linux software RAID? Then you probably don't have a
>> battery-backed RAID controller, so it would be wise to remove the
>> following entries since they could cause data loss in your case:
>>
>>                 no-disk-flushes;
>>                 no-md-flushes;
>>                 no-disk-barrier;
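
Just as a note for other readers: those three options belong in the "disk"
section of the resource. A rough sketch with a placeholder resource name
(they are only safe when the controller really has a BBU):

resource r0 {
  disk {
    no-disk-flushes;    # skip disk flushes for data writes
    no-md-flushes;      # skip flushes for DRBD metadata writes
    no-disk-barrier;    # disable write barriers
  }
}

Without a BBU it is better to leave them out and let DRBD issue flushes
and barriers.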
>
> Without these 3 options (and the right buffers) I get really bad
> performance; it's a mail server with a lot of small files, and with DRBD
> protocol C it sometimes takes too long to respond for some clients.
> The biggest mailbox has 127 GB and 204,000 messages inside.
> Every time the mail server builds a new mailbox index, you can imagine! :-(
>
> I did a lot of tests and it has worked fine with my config since Mar 2014;
> only in the last few weeks has DRBD started hurting me. :-(
>
>> Finally, from my experience Proxmox does not work well with HA enabled
>> (it's not aware of the underlying drbd resource), so it can cause
>> frequent split-brains to occur. Use DRBD without HA enabled on Proxmox,
>> in dual-primary mode (so you don't lose the live migration capability
>> on Proxmox).
>
> At the moment I have stopped HA and run it with pri/sec, but I want to
> get my HA back!
>
>> You can also create separate drbd resources for each Proxmox node so you
>> can better handle split brains. For example drbd0 -> drbdvg0 -> mounted
>> always on Node1 and drbd1 -> drbdvg1 -> mounted on Node2.
>> This way you will always know that VMs running on Node1 are located on
>> drbd0 and VMs running on Node2 are located on drbd1.
>
> I don't need this because I run only 1 VM in this cluster.
>
> Regards
>
> Richard
>
> _______________________________________________
> drbd-user mailing list
> drbd-user at lists.linbit.com
> http://lists.linbit.com/mailman/listinfo/drbd-user 



