Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.
On Tue, 17 Apr 2012 18:31:09 +0200, Felix Frank wrote:
> On 04/17/2012 05:06 PM, Jacek Osiecki wrote:
>> automatic recovery sometimes works and sometimes does not.
> we seem to be lacking your drbd config.

Right, my bad :)

> How is automatic split brain recovery configured?

Probably it isn't - here's the config:

global { usage-count yes; }
common {
  handlers {
    pri-on-incon-degr "/usr/lib/drbd/notify-pri-on-incon-degr.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
    pri-lost-after-sb "/usr/lib/drbd/notify-pri-lost-after-sb.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
    local-io-error "/usr/lib/drbd/notify-io-error.sh; /usr/lib/drbd/notify-emergency-shutdown.sh; echo o > /proc/sysrq-trigger ; halt -f";
  }
  disk { on-io-error detach; }
  syncer { rate 100M; }
}

and the resource config:

resource home {
  protocol C;
  meta-disk internal;
  device /dev/drbd0;
  disk /dev/md4;
  net {
    allow-two-primaries;
    after-sb-0pri discard-zero-changes;
    after-sb-1pri discard-secondary;
    after-sb-2pri disconnect;
  }
  startup { become-primary-on both; }
  on mike { address 176.xx.xx.xx:7789; }
  on november { address 176.yy.yy.yy:7789; }
}

> I get the feeling it's not. What split-brain situations have you
> perceived as being automatically solved?

Something like this:

[287856.619503] block drbd0: Handshake successful: Agreed network protocol version 96
[287856.619512] block drbd0: conn( WFConnection -> WFReportParams )
[287856.619682] block drbd0: Starting asender thread (from drbd0_receiver [24712])
[287856.619885] block drbd0: data-integrity-alg: <not-used>
[287856.619967] block drbd0: max BIO size = 130560
[287856.619978] block drbd0: drbd_sync_handshake:
[287856.619982] block drbd0: self 18D97D7348BC1031:232CE4A32F2915DB:B873B3F48F57A893:B872B3F48F57A893 bits:50 flags:0
[287856.619987] block drbd0: peer 8359D2DF4D7761E0:232CE4A32F2915DB:B873B3F48F57A893:B872B3F48F57A893 bits:3072 flags:2
[287856.619992] block drbd0: uuid_compare()=100 by rule 90
[287856.619995] block drbd0: helper command: /sbin/drbdadm initial-split-brain minor-0
[287856.622133] block drbd0: helper command: /sbin/drbdadm initial-split-brain minor-0 exit code 0 (0x0)
[287856.622136] block drbd0: Split-Brain detected, 1 primaries, automatically solved. Sync from this node
[287856.622141] block drbd0: peer( Unknown -> Secondary ) conn( WFReportParams -> WFBitMapS ) pdsk( DUnknown -> Consistent )
[287856.639285] block drbd0: peer( Secondary -> Primary )
[287856.986857] block drbd0: helper command: /sbin/drbdadm before-resync-source minor-0
[287856.988873] block drbd0: helper command: /sbin/drbdadm before-resync-source minor-0 exit code 0 (0x0)
[287856.988879] block drbd0: conn( WFBitMapS -> SyncSource ) pdsk( Consistent -> Inconsistent )
[287856.988884] block drbd0: Began resync as SyncSource (will sync 12484 KB [3121 bits set]).
[287856.988895] block drbd0: updated sync UUID 18D97D7348BC1031:232DE4A32F2915DB:232CE4A32F2915DB:B873B3F48F57A893
[287857.202264] block drbd0: Resync done (total 1 sec; paused 0 sec; 12484 K/sec)
[287857.202268] block drbd0: updated UUIDs 18D97D7348BC1031:0000000000000000:232DE4A32F2915DB:232CE4A32F2915DB
[287857.202272] block drbd0: conn( SyncSource -> Connected ) pdsk( Inconsistent -> UpToDate )
[287857.347396] block drbd0: bitmap WRITE of 4793 pages took 29 jiffies
[287857.419057] block drbd0: 0 KB (0 bits) marked out-of-sync by on disk bit-map.
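(For my own notes: if automatic resolution ever fails and the nodes end up StandAlone, the manual recovery described in the DRBD user's guide seems to be roughly the following - just a sketch, not something I have run on this setup; "home" is the resource name from the config above:

  # on the node whose changes should be discarded (the split-brain victim);
  # with OCFS2 mounted in dual-primary mode, unmount it first so the node can be demoted
  drbdadm secondary home
  drbdadm connect --discard-my-data home

  # on the surviving node, only if it is StandAlone as well
  drbdadm connect home

After that the victim should resynchronize from the survivor.)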
But now I see that those were probably split brains caused by the secondary node being rebooted, back when I was doing a lot of testing of the automatic drbd set-up after reboot. Am I right?

>> 73E08E8E06754D97:A9BC3587FC5AA879:0BF8587A4ABA37B5:0BF7587A4ABA37B5
>> bits:0 flags:0
> This looks fine - the peer has set 0 bits, so it's probably indeed
> unchanged.

>> why the case isn't solved, since second server doesn't write to drbd0,
>> sometimes even partition wasn't mounted (I can't be 100% sure, but it
>> seems so).
> A policy of discard-zero-changes could solve this for you, but only if
> configured thus.

Seems that my config is lacking this. My plan is to use DRBD+OCFS2 for an HA configuration, with two machines behind a hardware load-balancer. So far I have been modifying the filesystem on one machine only.

I'm wondering how to handle the situation where the nodes can't see each other but are still reachable through the internet (that's possible for distant locations). Are there any mechanisms capable of synchronizing the nodes at the filesystem level once node-to-node communication is up again? I mean that sometimes, even though both filesystems have been modified, the changes don't actually conflict... Is anyone using such a configuration? What policies are you using?

>> P.S. Any suggestions how to measure real performance (read/write/copy)
>> of DRBD+OCFS2? UnixBench gives crazy results (read performance about 10%
>> of local filesystem)...
> Is this crazy? I wouldn't know. But bear in mind that stat can be an
> expensive operation on a cluster file system vs. a regular old fs.

Here are the results from UnixBench, where I compared:
- local ext3 filesystem
- drbd+ocfs2 in a master-master cluster :)
- NFS from a NAS provided by OVH hosting

Results are in KBps, for copy/read/write. I didn't even dig into the exact meaning of the UnixBench parameters or its methodology; I just wanted to compare raw values in similar circumstances:

+-----------------------+-----------+----------------+------------------+
| X bufsize, Y maxblocks|ext3(local)|  drbd+ocfs2    | NFS (ovh-nas)    |
+-----------------------+-----------+----------------+------------------+
| CP 1024 buf 2000 mxbl | 1001513.5 |  329691.5 (33%)|    8439.9 (0.8%) |
| CP  256 buf  500 mxbl |  289354.4 |   83344.5 (29%)|    7545.5 (2.6%) |
| RD 1024 buf 2000 mxbl |16683047.3 | 1627301.6 (10%)|16026036.4 ( 96%) |
| RD  256 buf  500 mxbl | 4737836.5 |  413126.7 ( 9%)| 4509106.6 ( 95%) |
| RD 4096 buf 8000 mxbl |35705631.9 | 6872806.6 (19%)|34967996.7 ( 97%) |
| WR  256 buf  500 mxbl |  315172.2 |   87545.4 (28%)|    8711.3 (2.8%) |
| WR 4096 buf 8000 mxbl | 3522086.9 | 1290255.5 (37%)|   10991.6 (0.3%) |
+-----------------------+-----------+----------------+------------------+

I wrote "crazy" because 10% seems like quite a low value, especially compared to copy/write, which seem to run at roughly 33% of the local fs speed. Now I realize that the read speed is still much higher than the write/copy speed. However - could someone verify those values? I've just realized that the UnixBench results are hard to believe and seem much too high :)

Greetings,
-- 
Jacek Osiecki
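P.S. As a rough cross-check of the UnixBench numbers, a plain streaming test with dd might help - only a sketch, with /mnt/ocfs2 standing in for whichever mount point is being tested:

  # write ~1 GB, bypassing the page cache
  dd if=/dev/zero of=/mnt/ocfs2/ddtest bs=1M count=1024 oflag=direct

  # drop caches, then read the file back
  echo 3 > /proc/sys/vm/drop_caches
  dd if=/mnt/ocfs2/ddtest of=/dev/null bs=1M iflag=direct

  rm /mnt/ocfs2/ddtest

This only measures sequential throughput, not the small-buffer copy pattern UnixBench exercises, so the percentages wouldn't be directly comparable - but it should at least show whether reads over drbd+ocfs2 are really an order of magnitude slower than local ext3.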