[DRBD-user] DRBD + OCFS2 - Split-Brain detected but unresolved

David Coulson david at davidcoulson.net
Tue Apr 17 18:08:47 CEST 2012


http://www.drbd.org/users-guide/s-resolve-split-brain.html
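
For what it's worth, the two UUID lines in the log quoted below already show why DRBD reports a split brain. The following is a minimal Python sketch, not DRBD code. It assumes the four colon-separated values are printed in the order current:bitmap:history:history, and that "rule 90" amounts to "the current UUIDs differ while the bitmap UUIDs are equal and non-zero". Both assumptions are my reading of the log rather than something the log states. The function names are made up for illustration.

```python
import re

# The two UUID lines copied from the dmesg output quoted in this thread.
LOG = """\
[707152.210363] block drbd0: self 8631CEC3370B5C31:A9BC3587FC5AA879:0BF8587A4ABA37B5:0BF7587A4ABA37B5 bits:21 flags:0
[707152.210368] block drbd0: peer 73E08E8E06754D97:A9BC3587FC5AA879:0BF8587A4ABA37B5:0BF7587A4ABA37B5 bits:0 flags:0
"""

# Matches "self <4 UUIDs> bits:N" or "peer <4 UUIDs> bits:N".
UUID_LINE = re.compile(r"(self|peer) ([0-9A-F]{16}(?::[0-9A-F]{16}){3}) bits:(\d+)")


def parse_uuids(log):
    """Return {'self': {...}, 'peer': {...}} from drbd_sync_handshake lines."""
    nodes = {}
    for match in UUID_LINE.finditer(log):
        # Assumed field order: current, bitmap, then two history UUIDs.
        current, bitmap, *history = match.group(2).split(":")
        nodes[match.group(1)] = {
            "current": int(current, 16),
            "bitmap": int(bitmap, 16),
            "history": [int(h, 16) for h in history],
            "bits": int(match.group(3)),
        }
    return nodes


def looks_like_split_brain(nodes):
    """Hypothetical check: different current UUIDs, same non-zero bitmap UUID."""
    mask = ~1  # assumption: the lowest bit is a flag and is ignored when comparing
    mine, theirs = nodes["self"], nodes["peer"]
    current_differs = (mine["current"] & mask) != (theirs["current"] & mask)
    bitmap_equal = (mine["bitmap"] & mask) == (theirs["bitmap"] & mask)
    return current_differs and bitmap_equal and (mine["bitmap"] & mask) != 0


if __name__ == "__main__":
    print(looks_like_split_brain(parse_uuids(LOG)))
```

For the quoted lines the check returns True: both nodes share the bitmap UUID A9BC3587FC5AA879 but carry different current UUIDs. Note that the peer reports bits:0 and still has its own current UUID, so "no writes on the second node" does not by itself seem to prevent the split brain from being detected.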



On Apr 17, 2012, at 11:06 AM, Jacek Osiecki wrote:

> Hello,
> 
> I am currently testing dual-master setup with DRBD+OCFS2.
> Finally I managed to get it working well on kernel 2.6.39.4, DRBD version 8.3.10 (userland version: 8.4.1) and OCFS2 version 1.5.0.
> 
> I had some trouble with broken replication, and I see that automatic
> recovery sometimes works and sometimes does not. What's strange is that these are still tests, and while one server is fully functional, the second one has no processes that even touch the synchronized partition.
> 
> In dmesg on the active server it looks like this:
> 
> [707152.209885] block drbd0: Handshake successful: Agreed network protocol version 96
> [707152.209895] block drbd0: conn( WFConnection -> WFReportParams )
> [707152.210068] block drbd0: Starting asender thread (from drbd0_receiver [1096])
> [707152.210341] block drbd0: data-integrity-alg: <not-used>
> [707152.210352] block drbd0: max BIO size = 130560
> [707152.210359] block drbd0: drbd_sync_handshake:
> [707152.210363] block drbd0: self 8631CEC3370B5C31:A9BC3587FC5AA879:0BF8587A4ABA37B5:0BF7587A4ABA37B5 bits:21 flags:0
> [707152.210368] block drbd0: peer 73E08E8E06754D97:A9BC3587FC5AA879:0BF8587A4ABA37B5:0BF7587A4ABA37B5 bits:0 flags:0
> [707152.210371] block drbd0: uuid_compare()=100 by rule 90
> [707152.210377] block drbd0: helper command: /sbin/drbdadm initial-split-brain minor-0
> [707152.212439] block drbd0: helper command: /sbin/drbdadm initial-split-brain minor-0 exit code 0 (0x0)
> [707152.212442] block drbd0: Split-Brain detected but unresolved, dropping connection!
> [707152.212445] block drbd0: helper command: /sbin/drbdadm split-brain minor-0
> [707152.214134] block drbd0: helper command: /sbin/drbdadm split-brain minor-0 exit code 0 (0x0)
> [707152.214137] block drbd0: conn( WFReportParams -> Disconnecting )
> [707152.214141] block drbd0: error receiving ReportState, l: 4!
> [707152.214150] block drbd0: asender terminated
> [707152.214154] block drbd0: Terminating drbd0_asender
> [707152.214177] block drbd0: Connection closed
> [707152.214180] block drbd0: conn( Disconnecting -> StandAlone )
> [707152.214188] block drbd0: receiver terminated
> [707152.214190] block drbd0: Terminating drbd0_receiver
> 
> Is there any help for this situation? I don't understand why the split brain isn't resolved automatically, since the second server doesn't write to drbd0; sometimes the partition wasn't even mounted there (I can't be 100% sure, but it seems so).
> 
> I would be grateful if you could give me a hint on how to make this configuration stable without sacrificing data on one of the nodes (right now, in order to recover, I have to demote the second node to secondary). Any ideas what is wrong in my setup?
> 
> P.S. Any suggestions on how to measure the real performance (read/write/copy) of DRBD+OCFS2? UnixBench gives crazy results (read performance about 10% of the local filesystem)...
> 
> Best regards,
> -- 
> Jacek Osiecki
> josiecki at silvercube.pl
> 
> Silvercube s.c.
> ul. Makuszynskiego 4
> 31-752 Kraków
> +48 (12) 684 21 00
> _______________________________________________
> drbd-user mailing list
> drbd-user at lists.linbit.com
> http://lists.linbit.com/mailman/listinfo/drbd-user



