[DRBD-user] Strange number of 'Digest integrity check FAILED' after starting one agent

Lars Ellenberg lars.ellenberg at linbit.com
Sun Jul 3 11:14:12 CEST 2011


On Thu, May 26, 2011 at 08:06:39AM -0700, loopx wrote:
> 
> Hello, 
> 
> 
> We are running RHEL5 and DRBD version: 8.3.8 (api:88/proto:86-94) from
> CentOS Extra repository (no more update since the first install). Two
> servers are used :
> - Server_A (primary)
> - Server_B (secondary)
> 
> 
> Server_B is "inactive" regarding Server_A which is running applications (on
> "primary" side to access DRBD) but, because I'm using "Atlassian Bamboo"
> with commercial licence (and so, 1 remote agent free), I want to use the
> Bamboo Remote Agent on the Server_B (has no need to access to DRBD) to get
> better compilation performance/profitability.
> 
> 
> So, all "was" fine before starting the "Bamboo Remote Agent" ... "was"
> because I know that error "Digest integrity check FAILED" occur sometimes.
> 
> Before the Remote Agent :
> --------------------
> [1][root at Server_B ~]$ zcat /var/log/messages.4.gz /var/log/messages.3.gz |
> grep FAILED
> Apr 26 20:00:00 Server_B kernel: block drbd0: Digest integrity check FAILED.
> Apr 27 04:02:02 Server_B kernel: block drbd0: Digest integrity check FAILED.

The data changed between the sender calculating the checksum and the
receiver verifying it. That can have several causes. Bad hardware is
one of them; "buffers modified in flight" is another.
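
To make "buffers modified in flight" concrete, here is a minimal
user-space sketch in Python. It is not DRBD code: SHA-1 from hashlib
merely stands in for whatever data-integrity-alg is configured, and the
names send_with_digest, verify and scribble, as well as the mutate hook,
are made up for this illustration.

```python
import hashlib


def digest(data):
    """Checksum over the payload; SHA-1 is only a stand-in here."""
    return hashlib.sha1(bytes(data)).hexdigest()


def send_with_digest(buf, mutate=None):
    """Simulate a sender: checksum the buffer first, copy it out afterwards.

    `mutate` is a hypothetical hook standing in for an application that
    keeps writing to the same buffer while it is "in flight".
    """
    checksum = digest(buf)        # step 1: checksum is calculated
    if mutate is not None:
        mutate(buf)               # buffer changes after the checksum was taken
    payload = bytes(buf)          # step 2: data actually put on the wire
    return payload, checksum


def verify(payload, checksum):
    """Simulate the receiver: recompute the checksum and compare."""
    return digest(payload) == checksum


def scribble(buf):
    """Flip the bits of the first byte, as a late write would."""
    buf[0] ^= 0xFF


stable = send_with_digest(bytearray(b"block contents"))
racy = send_with_digest(bytearray(b"block contents"), mutate=scribble)

print("stable buffer verifies:", verify(*stable))
print("modified-in-flight buffer verifies:", verify(*racy))
```

In the second case no hardware is at fault, yet the check fails. The
receiver, looking only at payload and checksum, cannot tell which of
the possible causes it was.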

Read up about stable pages:
http://lwn.net/Articles/442355/

And, in one of the threads referenced from there,
you find an easy reproducer for "buffers modified in flight":
http://thread.gmane.org/gmane.linux.kernel/1103571/focus=51245

If your "agent" triggers a similar behaviour, well, that's it.

I have not tried it myself yet, but from what I have seen being merged
into Linux 3.0, at least ext4 should have "stable pages" as of that
release.

So things are improving.

Meanwhile, all such integrity checking, whether "home grown" as in DRBD
or "standard" like DIF/DIX, is only a tool to detect certain symptoms.
It does *not* imply any diagnosis without further context information
and analysis.

-- 
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
__
please don't Cc me, but send to list   --   I'm subscribed


