Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.
On Tue, 17 Apr 2012 18:31:09 +0200, Felix Frank wrote:
> On 04/17/2012 05:06 PM, Jacek Osiecki wrote:
>> automatic recovery sometimes works and sometimes does
>> not.
> we seem to be lacking your drbd config.
Right, my bad :)
> How is automatic split brain recovery configured?
Probably it isn't - here's the config:
global { usage-count yes; }
common {
    handlers {
        pri-on-incon-degr "/usr/lib/drbd/notify-pri-on-incon-degr.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
        pri-lost-after-sb "/usr/lib/drbd/notify-pri-lost-after-sb.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
        local-io-error "/usr/lib/drbd/notify-io-error.sh; /usr/lib/drbd/notify-emergency-shutdown.sh; echo o > /proc/sysrq-trigger ; halt -f";
    }
    disk { on-io-error detach; }
    syncer { rate 100M; }
}
and the resource config:
resource home {
    protocol C;
    meta-disk internal;
    device /dev/drbd0;
    disk /dev/md4;
    net {
        allow-two-primaries;
        after-sb-0pri discard-zero-changes;
        after-sb-1pri discard-secondary;
        after-sb-2pri disconnect;
    }
    startup { become-primary-on both; }
    on mike { address 176.xx.xx.xx:7789; }
    on november { address 176.yy.yy.yy:7789; }
}
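I am also thinking of wiring in the stock split-brain notification
handler, so that I get mailed whenever this happens - just a sketch, and
the recipient "root" is only a placeholder:

handlers {
    # runs when DRBD detects a split brain; the bundled script mails
    # the given recipient
    split-brain "/usr/lib/drbd/notify-split-brain.sh root";
}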
> I get the feeling it's not. What split-brain situations have you
> perceived as being automatically solved?
Something like this:
[287856.619503] block drbd0: Handshake successful: Agreed network protocol version 96
[287856.619512] block drbd0: conn( WFConnection -> WFReportParams )
[287856.619682] block drbd0: Starting asender thread (from drbd0_receiver [24712])
[287856.619885] block drbd0: data-integrity-alg: <not-used>
[287856.619967] block drbd0: max BIO size = 130560
[287856.619978] block drbd0: drbd_sync_handshake:
[287856.619982] block drbd0: self 18D97D7348BC1031:232CE4A32F2915DB:B873B3F48F57A893:B872B3F48F57A893 bits:50 flags:0
[287856.619987] block drbd0: peer 8359D2DF4D7761E0:232CE4A32F2915DB:B873B3F48F57A893:B872B3F48F57A893 bits:3072 flags:2
[287856.619992] block drbd0: uuid_compare()=100 by rule 90
[287856.619995] block drbd0: helper command: /sbin/drbdadm initial-split-brain minor-0
[287856.622133] block drbd0: helper command: /sbin/drbdadm initial-split-brain minor-0 exit code 0 (0x0)
[287856.622136] block drbd0: Split-Brain detected, 1 primaries, automatically solved. Sync from this node
[287856.622141] block drbd0: peer( Unknown -> Secondary ) conn( WFReportParams -> WFBitMapS ) pdsk( DUnknown -> Consistent )
[287856.639285] block drbd0: peer( Secondary -> Primary )
[287856.986857] block drbd0: helper command: /sbin/drbdadm before-resync-source minor-0
[287856.988873] block drbd0: helper command: /sbin/drbdadm before-resync-source minor-0 exit code 0 (0x0)
[287856.988879] block drbd0: conn( WFBitMapS -> SyncSource ) pdsk( Consistent -> Inconsistent )
[287856.988884] block drbd0: Began resync as SyncSource (will sync 12484 KB [3121 bits set]).
[287856.988895] block drbd0: updated sync UUID 18D97D7348BC1031:232DE4A32F2915DB:232CE4A32F2915DB:B873B3F48F57A893
[287857.202264] block drbd0: Resync done (total 1 sec; paused 0 sec; 12484 K/sec)
[287857.202268] block drbd0: updated UUIDs 18D97D7348BC1031:0000000000000000:232DE4A32F2915DB:232CE4A32F2915DB
[287857.202272] block drbd0: conn( SyncSource -> Connected ) pdsk( Inconsistent -> UpToDate )
[287857.347396] block drbd0: bitmap WRITE of 4793 pages took 29 jiffies
[287857.419057] block drbd0: 0 KB (0 bits) marked out-of-sync by on disk bit-map.
but now I see that those were probably split brains caused by the
secondary node being rebooted, back when I was doing a lot of testing of
the automatic set-up of DRBD after reboot. Am I right?
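To help check that, I suppose I can dump the generation identifiers on
both nodes after the next reboot and compare them (assuming I read the
drbdadm man page correctly; resource name "home" as above):

# print the current/bitmap/history UUIDs and flags for resource "home"
drbdadm show-gi home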
>> 73E08E8E06754D97:A9BC3587FC5AA879:0BF8587A4ABA37B5:0BF7587A4ABA37B5
>> bits:0 flags:0
> This looks fine - the peer has set 0 bits, so it's probably indeed
> unchanged.
>> why the case isn't solved, since the second server doesn't write to
>> drbd0; sometimes the partition wasn't even mounted (I can't be 100%
>> sure, but it seems so).
> A policy of discard-zero-changes could solve this for you, but only if
> configured thus.
Seems that my config is lacking this.
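In the meantime, for the split brains that the automatic policies won't
resolve (e.g. both nodes primary with after-sb-2pri disconnect), I
understand the manual recovery is roughly the following - a sketch with
resource "home"; on 8.3 the --discard-my-data option goes after the
double dash:

# on the node whose changes are to be thrown away (the "victim"):
drbdadm secondary home
drbdadm -- --discard-my-data connect home
# on the other node, only if it is also StandAlone:
drbdadm connect home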
My plan is to use DRBD+OCFS2 for an HA configuration, with two machines
behind a hardware load-balancer. So far I have only been modifying the
filesystem on one machine. I'm wondering how to handle the situation
where the nodes can't see each other but are still reachable from the
internet (that's possible for distant locations). Are there any
mechanisms capable of synchronizing the nodes on the filesystem level
once node-to-node communication is up again? I mean that sometimes, even
though both filesystems have been modified, the changes don't cause any
conflicts...
Is anyone using such a configuration? What policies are you using?
>> P.S. Any suggestions how to measure real performance (read/write/copy)
>> of DRBD+OCFS2? UnixBench gives crazy results (read performance about
>> 10% of local filesystem)...
> Is this crazy? I wouldn't know. But bear in mind that stat can be an
> expensive operation on a cluster file system vs. a regular old fs.
Here are the results from UnixBench, where I compared:
- a local ext3 filesystem
- drbd+ocfs2 in a master-master cluster :)
- NFS from a NAS provided by OVH hosting
Results are in KBps, for copy/read/write. I didn't even dig into the
exact meaning of the UnixBench parameters or its methodology; I mainly
wanted to compare raw values under similar circumstances:
+------------------------+-------------+------------------+-------------------+
| X bufsize, Y maxblocks | ext3 (local)|   (drbd+ocfs2)   |   NFS (ovh-nas)   |
+------------------------+-------------+------------------+-------------------+
| CP 1024 buf 2000 mxbl  |   1001513.5 |  329691.5 (33%)  |    8439.9 (0.8%)  |
| CP  256 buf  500 mxbl  |    289354.4 |   83344.5 (29%)  |    7545.5 (2.6%)  |
| RD 1024 buf 2000 mxbl  |  16683047.3 | 1627301.6 (10%)  | 16026036.4 (96%)  |
| RD  256 buf  500 mxbl  |   4737836.5 |  413126.7 ( 9%)  |  4509106.6 (95%)  |
| RD 4096 buf 8000 mxbl  |  35705631.9 | 6872806.6 (19%)  | 34967996.7 (97%)  |
| WR  256 buf  500 mxbl  |    315172.2 |   87545.4 (28%)  |    8711.3 (2.8%)  |
| WR 4096 buf 8000 mxbl  |   3522086.9 | 1290255.5 (37%)  |   10991.6 (0.3%)  |
+------------------------+-------------+------------------+-------------------+
I wrote "crazy" since 10% seems to be quite a low value, especially
when
comparing to copy/write, which seem to be running at 33% of local fs
speed.
Now I realize, that read speed is still much higher than write/copy
speed.
However - could someone verify those values? I just realized that
UnixBench
results are hard to believe and seem to be muuch to high :)
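As a simpler sanity check I may also just time sequential writes and
reads with dd using direct I/O, to keep the page cache out of the
picture (a rough sketch; the mount point and sizes are arbitrary):

# sequential write of 1 GiB, bypassing the page cache
dd if=/dev/zero of=/mnt/ocfs2/ddtest bs=1M count=1024 oflag=direct
# sequential read of the same file, again with direct I/O
dd if=/mnt/ocfs2/ddtest of=/dev/null bs=1M iflag=direct
rm /mnt/ocfs2/ddtest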
Greetings,
--
Jacek Osiecki