[DRBD-user] Digest integrity check FAILED

Thiago Vinhas thiago at vinhas.org
Thu Jun 23 12:39:00 CEST 2011


Hi,

I'm testing a DRBD+MySQL environment in production, but after a while the
second node always gets disconnected, and I have no idea whether it's a
hardware problem or a misconfiguration.
The second node is not even mounted. I'm just replicating the data, not
using it.

The error log is at the end of this message. Here is my configuration:


resource r0 {
        meta-disk internal;
        device /dev/drbd0;
        disk /dev/sda4;

        syncer { rate 33M; }

        handlers {
                split-brain "/etc/init.d/mysql stop";
        }

        net {
                allow-two-primaries;
                after-sb-0pri discard-zero-changes;
                after-sb-1pri discard-secondary;
                after-sb-2pri disconnect;
                data-integrity-alg crc32c;
                ko-count 4;
        }

        startup { become-primary-on both; }

        on stewart { address 192.168.0.1:7789; }
        on prost { address 192.168.0.2:7789; }
}


Is there something wrong in my conf? Should I change something?
Another problem is that after the second node gets disconnected, I have to
reconnect it by hand by running "drbdadm connect r0". Apparently, after
running it the nodes re-sync quickly (in less than a minute), and the
previously disconnected node comes back as Secondary, so I have to run
"drbdadm primary r0".
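
The manual recovery described above can be sketched as a small shell
function. This is only an illustration of the two commands already quoted:
the names run, reconnect_r0 and DRY_RUN are hypothetical helpers, not part of
DRBD, and the sketch does not wait for the resync to finish before promoting.

```shell
# Sketch of the manual recovery steps; run, reconnect_r0 and DRY_RUN are
# illustrative names, not DRBD features.
run() {
    # With DRY_RUN=1, print the command instead of executing it.
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

reconnect_r0() {
    # Reconnect the StandAlone node to its peer.
    run drbdadm connect r0 || return 1
    # Show the connection state so the resync can be checked by hand.
    run drbdadm cstate r0
    # Promote the node again; in the post this is done after the resync.
    run drbdadm primary r0
}
```

Calling reconnect_r0 with DRY_RUN=1 set only prints the three commands, which
is a safe way to check the sequence before running it on a real node.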

Both nodes are Dell PowerEdge R710 servers with 48 GB of RAM, running RHEL 5.6
and DRBD 8.3.10 (from ELRepo).

Am I missing something here?


Thanks for any help!

Regards,
Thiago Vinhas
block drbd0: Digest integrity check FAILED: 63266864s +4096
block drbd0: error receiving Data, l: 4136!
block drbd0: peer( Primary -> Unknown ) conn( Connected -> ProtocolError ) pdsk( UpToDate -> DUnknown )
block drbd0: new current UUID 66983E6BBEE733F5:6157ABDB87926AA5:0001000000000001:5905CD0F6B61A6A9
block drbd0: asender terminated
block drbd0: Terminating asender thread
block drbd0: Connection closed
block drbd0: conn( ProtocolError -> Unconnected )
block drbd0: receiver terminated
block drbd0: Restarting receiver thread
block drbd0: receiver (re)started
block drbd0: conn( Unconnected -> WFConnection )
block drbd0: Handshake successful: Agreed network protocol version 96
block drbd0: conn( WFConnection -> WFReportParams )
block drbd0: Starting asender thread (from drbd0_receiver [7794])
block drbd0: data-integrity-alg: md5
block drbd0: drbd_sync_handshake:
block drbd0: self 66983E6BBEE733F5:6157ABDB87926AA5:0001000000000001:5905CD0F6B61A6A9 bits:0 flags:0
block drbd0: peer 4C9FC71A2D13AF9F:6157ABDB87926AA5:0001000000000000:5905CD0F6B61A6A9 bits:40 flags:0
block drbd0: uuid_compare()=100 by rule 90
block drbd0: helper command: /sbin/drbdadm initial-split-brain minor-0
block drbd0: helper command: /sbin/drbdadm initial-split-brain minor-0 exit code 0 (0x0)
block drbd0: Split-Brain detected but unresolved, dropping connection!
block drbd0: helper command: /sbin/drbdadm split-brain minor-0
block drbd0: meta connection shut down by peer.
block drbd0: conn( WFReportParams -> NetworkFailure )
block drbd0: asender terminated
block drbd0: Terminating asender thread
block drbd0: helper command: /sbin/drbdadm split-brain minor-0 exit code 0 (0x0)
block drbd0: conn( NetworkFailure -> Disconnecting )
block drbd0: error receiving ReportState, l: 4!
block drbd0: Connection closed
block drbd0: conn( Disconnecting -> StandAlone )
block drbd0: receiver terminated
block drbd0: Terminating receiver thread
