[DRBD-user] Digest integrity check FAILED (when using MySql)

Martin Gombač martin at isg.si
Mon Jan 11 19:44:44 CET 2010



Hi,

Under mild-to-heavy MySQL load in a Xen PV server hosted on a DRBD block
device, I get "Digest integrity check FAILED". dd or copy operations
do not seem to trigger it.
I've read earlier posts about this issue, so I'm aware that the check is
protecting my data, and I know the usual ways to go about fixing it:
http://lists.linbit.com/pipermail/drbd-user/2008-January/008343.html
http://lists.linbit.com/pipermail/drbd-user/2009-February/011357.html

The first thing I did was turn off all offloading on the network card, so
that the work is done by the CPU (rx, tx, sg, tso, ufo, gso and gro all
set to off), but it didn't help. Then I switched to a second network card,
an Intel server-class one, and it still failed. Offloading on or off, it
made no difference. I tried two more network cards, just to be sure.
These were the cards:
# Broadcom Corporation NetXtreme BCM5721 Gigabit Ethernet PCI Express
# Intel Corporation 82574L Gigabit Network Connection
# Intel Corporation 82572EI Gigabit Ethernet Controller (Copper)
# Realtek Semiconductor Co., Ltd. RTL-8139/8139C/8139C+
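
For anyone repeating the offload test described above, here is a minimal
sketch of how the seven features can be switched off with ethtool. The
interface name eth1 is a placeholder and not taken from this setup, and an
older or newer ethtool/driver may not know every feature name (ufo in
particular), so treat it as a starting point only. By default it only
prints the command; set APPLY=1 to run it as root.

```shell
#!/bin/sh
# Sketch: build the ethtool command that turns off the offloads listed above.
# "eth1" is a placeholder interface name (an assumption, not from the post).
IFACE="${IFACE:-eth1}"

offload_cmd() {
    # $1 = interface name; prints the full ethtool command line
    cmd="ethtool -K $1"
    for feature in rx tx sg tso ufo gso gro; do
        cmd="$cmd $feature off"
    done
    printf '%s\n' "$cmd"
}

# Dry run by default; APPLY=1 actually changes the NIC settings (needs root).
if [ "${APPLY:-0}" = "1" ]; then
    eval "$(offload_cmd "$IFACE")"
    ethtool -k "$IFACE"   # show the resulting offload state
else
    offload_cmd "$IFACE"
fi
```

The settings have to be applied on both nodes, on the interface that
carries the DRBD replication traffic, and they do not survive a reboot
unless they are added to the network configuration.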

The system is the latest stable CentOS (5.3) running Xen. DRBD is from
the extras repo.
Linux ibm1 2.6.18-164.el5xen #1 SMP Thu Sep 3 04:03:03 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux
version: 8.3.2 (api:88/proto:86-90)

I've also read Lars's text:

>>
if it is an option for you,
could you try a kernel.org 2.6.28.2 kernel?

there have been some changes
in the generic dirty page write out path
which affect data integrity.

if I understand those changes correctly,
apparently all kernels up to that one
may return early on fsync and similar operations.
<<

I would like to know whether the bug he's talking about is critical for
data consistency or only confuses the verification process.
I would also appreciate any ideas on how to deal with this situation.
Thank you.

Some more info. Both servers are identical:
00:00.0 Host bridge: Intel Corporation 5000P Chipset Memory Controller Hub (rev b1)
00:02.0 PCI bridge: Intel Corporation 5000 Series Chipset PCI Express x8 Port 2-3 (rev b1)
00:03.0 PCI bridge: Intel Corporation 5000 Series Chipset PCI Express x4 Port 3 (rev b1)
00:04.0 PCI bridge: Intel Corporation 5000 Series Chipset PCI Express x8 Port 4-5 (rev b1)
00:05.0 PCI bridge: Intel Corporation 5000 Series Chipset PCI Express x4 Port 5 (rev b1)
00:06.0 PCI bridge: Intel Corporation 5000 Series Chipset PCI Express x8 Port 6-7 (rev b1)
00:07.0 PCI bridge: Intel Corporation 5000 Series Chipset PCI Express x4 Port 7 (rev b1)
00:10.0 Host bridge: Intel Corporation 5000 Series Chipset FSB Registers (rev b1)
00:10.1 Host bridge: Intel Corporation 5000 Series Chipset FSB Registers (rev b1)
00:10.2 Host bridge: Intel Corporation 5000 Series Chipset FSB Registers (rev b1)
00:11.0 Host bridge: Intel Corporation 5000 Series Chipset Reserved Registers (rev b1)
00:13.0 Host bridge: Intel Corporation 5000 Series Chipset Reserved Registers (rev b1)
00:15.0 Host bridge: Intel Corporation 5000 Series Chipset FBD Registers (rev b1)
00:16.0 Host bridge: Intel Corporation 5000 Series Chipset FBD Registers (rev b1)
00:1c.0 PCI bridge: Intel Corporation 631xESB/632xESB/3100 Chipset PCI Express Root Port 1 (rev 09)
00:1c.1 PCI bridge: Intel Corporation 631xESB/632xESB/3100 Chipset PCI Express Root Port 2 (rev 09)
00:1d.0 USB Controller: Intel Corporation 631xESB/632xESB/3100 Chipset UHCI USB Controller #1 (rev 09)
00:1d.1 USB Controller: Intel Corporation 631xESB/632xESB/3100 Chipset UHCI USB Controller #2 (rev 09)
00:1d.2 USB Controller: Intel Corporation 631xESB/632xESB/3100 Chipset UHCI USB Controller #3 (rev 09)
00:1d.3 USB Controller: Intel Corporation 631xESB/632xESB/3100 Chipset UHCI USB Controller #4 (rev 09)
00:1d.7 USB Controller: Intel Corporation 631xESB/632xESB/3100 Chipset EHCI USB2 Controller (rev 09)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev d9)
00:1f.0 ISA bridge: Intel Corporation 631xESB/632xESB/3100 Chipset LPC Interface Controller (rev 09)
00:1f.1 IDE interface: Intel Corporation 631xESB/632xESB IDE Controller (rev 09)
00:1f.2 SATA controller: Intel Corporation 631xESB/632xESB SATA AHCI Controller (rev 09)
00:1f.3 SMBus: Intel Corporation 631xESB/632xESB/3100 Chipset SMBus Controller (rev 09)
01:00.0 PCI bridge: Intel Corporation 6311ESB/6321ESB PCI Express Upstream Port (rev 01)
01:00.3 PCI bridge: Intel Corporation 6311ESB/6321ESB PCI Express to PCI-X Bridge (rev 01)
02:00.0 PCI bridge: Intel Corporation 6311ESB/6321ESB PCI Express Downstream Port E1 (rev 01)
02:01.0 PCI bridge: Intel Corporation 6311ESB/6321ESB PCI Express Downstream Port E2 (rev 01)
03:00.0 RAID bus controller: Adaptec AAC-RAID (Rocket) (rev 02)
15:00.0 Ethernet controller: Intel Corporation 82574L Gigabit Network Connection
1a:00.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5721 Gigabit Ethernet PCI Express (rev 21)
1c:01.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8139/8139C/8139C+ (rev 10)
1c:04.0 VGA compatible controller: ATI Technologies Inc ES1000 (rev 02)


[root@ibm2 etc]# tail -f /var/log/messages | grep FAIL
Jan 11 19:03:19 ibm2 kernel: block drbd0: Digest integrity check FAILED.
Jan 11 19:03:21 ibm2 pengine: [16979]: notice: native_print: drbd_r0:1    (ocf::linbit:drbd):    Master ibm2 FAILED
Jan 11 19:03:21 ibm2 pengine: [16979]: notice: native_print: drbd_r0:1    (ocf::linbit:drbd):    Master ibm2 FAILED
Jan 11 19:03:24 ibm2 pengine: [16979]: notice: native_print: drbd_r0:1    (ocf::linbit:drbd):    Slave ibm2 FAILED
Jan 11 19:03:59 ibm2 kernel: block drbd0: Digest integrity check FAILED.
Jan 11 19:04:02 ibm2 pengine: [16979]: notice: native_print: drbd_r0:1    (ocf::linbit:drbd):    Master ibm2 FAILED
Jan 11 19:04:02 ibm2 pengine: [16979]: notice: native_print: drbd_r0:1    (ocf::linbit:drbd):    Master ibm2 FAILED
Jan 11 19:04:02 ibm2 pengine: [16979]: notice: native_print: drbd_r0:1    (ocf::linbit:drbd):    Master ibm2 FAILED
Jan 11 19:04:34 ibm2 kernel: block drbd0: Digest integrity check FAILED.
Jan 11 19:04:45 ibm2 pengine: [16979]: notice: native_print: drbd_r0:1    (ocf::linbit:drbd):    Master ibm2 FAILED
Jan 11 19:04:45 ibm2 pengine: [16979]: notice: native_print: drbd_r0:1    (ocf::linbit:drbd):    Master ibm2 FAILED
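
Since grep FAIL also matches the pengine status lines, a narrower filter
makes it easier to see how often the kernel itself reports a digest
failure. The following is only a rough sketch; it assumes the syslog
layout shown in the excerpt above, and the function name is made up for
illustration.

```shell
#!/bin/sh
# Sketch: count DRBD digest failures per minute in a syslog-style file.
# Assumes lines of the form "Mon DD HH:MM:SS host kernel: ..." as shown above.
count_digest_failures() {
    # $1 = log file; prints "<count> <Mon> <DD> <HH:MM>" per minute
    grep 'Digest integrity check FAILED' "$1" |
        awk '{ print $1, $2, substr($3, 1, 5) }' |
        sort | uniq -c
}

# Default log path matches the one used in the post; override with LOG=...
LOG="${LOG:-/var/log/messages}"
if [ -r "$LOG" ]; then
    count_digest_failures "$LOG"
fi
```

Comparing these counts with the MySQL load at the same times may help
confirm that the failures follow write activity rather than a particular
network card.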

Regards,
M.



