[DRBD-user] Digest integrity check FAILED

Lars Ellenberg lars.ellenberg at linbit.com
Fri Jun 24 19:27:51 CEST 2011


On Thu, Jun 23, 2011 at 07:39:00AM -0300, Thiago Vinhas wrote:
> Hi,
> 
> I'm testing a DRBD+MySQL environment in production, but after a while the
> second node always gets disconnected, and I have no idea if it's a hardware
> problem or misconfiguration.
> The second node is not even mounted. I'm just replicating the data, not
> using it.
> 
> The error is at the end of the message. Here is my conf:
> 
> 
> resource r0 {
>         meta-disk internal;
>         device /dev/drbd0;
>         disk /dev/sda4;
> 
>         syncer { rate 33M; }
> 
>         handlers {
>         split-brain "/etc/init.d/mysql stop";
>         }
> 
>         net {
>                 allow-two-primaries;

WHY?? You very likely do not want two primaries,
only you do not know it yet ;-)


>                 after-sb-0pri discard-zero-changes;
>                 after-sb-1pri discard-secondary;
>                 after-sb-2pri disconnect;
>                 data-integrity-alg crc32c;

Have you read this message?

http://www.mail-archive.com/drbd-user@lists.linbit.com/msg03373.html


>                 ko-count 4;
>         }
> 
>         startup { become-primary-on both; }

Why??
You do not want that.
Really.
Most people trying to use "dual primary DRBD"
do not really need it.

If you think you really want it, make sure that you understand,
and are able to deal with, the additional complexity it involves.

You realize of course that concurrent access with standard file systems
simply does not work; for that you need to use OCFS or GFS.

>         on stewart { address 192.168.0.1:7789; }
>         on prost { address 192.168.0.2:7789; }
> }
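
For comparison, a single-primary variant of the resource quoted above
might look like the sketch below. It is only the posted configuration
with allow-two-primaries and the "become-primary-on both" startup
section dropped. Everything else, including data-integrity-alg, is
copied unchanged, and it has not been tested on this setup.

```
# Sketch only: the resource as posted, minus the dual-primary settings.
resource r0 {
        meta-disk internal;
        device /dev/drbd0;
        disk /dev/sda4;

        syncer { rate 33M; }

        handlers {
                split-brain "/etc/init.d/mysql stop";
        }

        net {
                # allow-two-primaries removed: only one node is Primary at a time
                after-sb-0pri discard-zero-changes;
                after-sb-1pri discard-secondary;
                after-sb-2pri disconnect;
                data-integrity-alg crc32c;      # unchanged from the posted conf
                ko-count 4;
        }

        # startup { become-primary-on both; } removed: promote one node
        # explicitly (by hand with "drbdadm primary r0", or from a cluster manager)

        on stewart { address 192.168.0.1:7789; }
        on prost { address 192.168.0.2:7789; }
}
```

With this layout the second node simply stays Secondary, which matches
the stated use of replicating the data without mounting it there.
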
> 
> 
> Is there something wrong in my conf? Should I change something?
> Another problem is that after the second node gets disconnected, I have to
> reconnect it by hand by running "drbdadm connect r0". Apparently after
> running it the nodes get quickly re-synced (less than a minute), and the
> previously disconnected node starts as Secondary, so I had to run "drbdadm
> primary r0".
> 
> Both nodes are Dell PowerEdge R710 with 48GB of ram, running RHEL 5.6 and
> DRBD 8.3.10 (from ElRepo).
> 
> Am I missing something here?
> 
> 
> Thanks for any help!
> 
> Regards,
> Thiago Vinhas
> block drbd0: Digest integrity check FAILED: 63266864s +4096
> block drbd0: error receiving Data, l: 4136!
> block drbd0: peer( Primary -> Unknown ) conn( Connected -> ProtocolError ) pdsk( UpToDate -> DUnknown )
> block drbd0: new current UUID 66983E6BBEE733F5:6157ABDB87926AA5:0001000000000001:5905CD0F6B61A6A9
> block drbd0: asender terminated
> block drbd0: Terminating asender thread
> block drbd0: Connection closed
> block drbd0: conn( ProtocolError -> Unconnected )
> block drbd0: receiver terminated
> block drbd0: Restarting receiver thread
> block drbd0: receiver (re)started
> block drbd0: conn( Unconnected -> WFConnection )
> block drbd0: Handshake successful: Agreed network protocol version 96
> block drbd0: conn( WFConnection -> WFReportParams )
> block drbd0: Starting asender thread (from drbd0_receiver [7794])
> block drbd0: data-integrity-alg: md5
> block drbd0: drbd_sync_handshake:
> block drbd0: self 66983E6BBEE733F5:6157ABDB87926AA5:0001000000000001:5905CD0F6B61A6A9 bits:0 flags:0
> block drbd0: peer 4C9FC71A2D13AF9F:6157ABDB87926AA5:0001000000000000:5905CD0F6B61A6A9 bits:40 flags:0
> block drbd0: uuid_compare()=100 by rule 90
> block drbd0: helper command: /sbin/drbdadm initial-split-brain minor-0
> block drbd0: helper command: /sbin/drbdadm initial-split-brain minor-0 exit code 0 (0x0)
> block drbd0: Split-Brain detected but unresolved, dropping connection!
> block drbd0: helper command: /sbin/drbdadm split-brain minor-0
> block drbd0: meta connection shut down by peer.
> block drbd0: conn( WFReportParams -> NetworkFailure )
> block drbd0: asender terminated
> block drbd0: Terminating asender thread
> block drbd0: helper command: /sbin/drbdadm split-brain minor-0 exit code 0 (0x0)
> block drbd0: conn( NetworkFailure -> Disconnecting )
> block drbd0: error receiving ReportState, l: 4!
> block drbd0: Connection closed
> block drbd0: conn( Disconnecting -> StandAlone )
> block drbd0: receiver terminated
> block drbd0: Terminating receiver thread
> 
> Abs,
> Thiago Vinhas



-- 
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
__
please don't Cc me, but send to list   --   I'm subscribed


