[DRBD-user] recovering from "Local IO failed. Detaching..."

Gianluca Cecchi gianluca.cecchi at gmail.com
Thu Sep 10 18:28:23 CEST 2009


On Thu, Sep 10, 2009 at 6:20 PM, Lars Ellenberg
<lars.ellenberg at linbit.com> wrote:

> [snip]
>
> I've seen similar symptoms before, and it could be worked around by
> disabling offloading settings on the NICs used for the replication ;)
> I know, that interaction sounds a bit far-fetched, but those are the
> facts.
>
> # to view offload settings
> ethtool -k eth7
> # to switch them all off:
> ethtool -K eth7 rx off tx off sg off tso off
>
>
>
[root@virtfedbis ~]# ethtool -k eth3
Offload parameters for eth3:
Cannot get device flags: Operation not supported
rx-checksumming: on
tx-checksumming: on
scatter-gather: on
tcp-segmentation-offload: on
udp-fragmentation-offload: off
generic-segmentation-offload: on
generic-receive-offload: off
large-receive-offload: off

[root@virtfedbis ~]# ethtool -K eth3 rx off tx off sg off tso off

[root@virtfedbis ~]# ethtool -k eth3
Offload parameters for eth3:
Cannot get device flags: Operation not supported
rx-checksumming: off
tx-checksumming: off
scatter-gather: off
tcp-segmentation-offload: off
udp-fragmentation-offload: off
generic-segmentation-offload: on
generic-receive-offload: off
large-receive-offload: off
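
Note that generic-segmentation-offload is still reported as "on" after the
command above, because gso was not among the settings switched off. A minimal
sketch for spotting such leftovers, assuming the "name: on|off" output format
shown in this post; the function name offloads_still_on is made up for
illustration and is not part of ethtool:

```shell
#!/bin/sh
# Hypothetical helper: read `ethtool -k <iface>` output on stdin and print
# the name of every offload feature that is still reported as "on".
# Lines without a "name: on..." shape (header, error messages) are skipped.
offloads_still_on() {
    awk -F': ' '$2 ~ /^on/ { sub(/^[ \t]+/, "", $1); print $1 }'
}

# Usage on the replication NIC from this post (run on both peers):
#   ethtool -k eth3 | offloads_still_on
# A feature left on could then be switched off with the matching -K keyword,
# for example:
#   ethtool -K eth3 gso off
# Whether GSO matters for this particular problem is not known from the logs.
```

With the second listing above as input, the helper would print only
generic-segmentation-offload.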

If I try the attach without applying the same settings to eth3 on the other
peer, I get:

Sep 10 18:24:34 virtfedbis kernel: block drbd0: disk( Diskless -> Attaching
)
Sep 10 18:24:34 virtfedbis kernel: block drbd0: Found 6 transactions (244
active extents) in activity log.
Sep 10 18:24:34 virtfedbis kernel: block drbd0: Method to ensure write
ordering: barrier
Sep 10 18:24:34 virtfedbis kernel: block drbd0: max_segment_size ( = BIO
size ) = 32768
Sep 10 18:24:34 virtfedbis kernel: block drbd0: recounting of set bits took
additional 1 jiffies
Sep 10 18:24:34 virtfedbis kernel: block drbd0: 920 MB (235520 bits) marked
out-of-sync by on disk bit-map.
Sep 10 18:24:34 virtfedbis kernel: block drbd0: Marked additional 0 KB as
out-of-sync based on AL.
Sep 10 18:24:34 virtfedbis kernel: end_request: I/O error, dev cciss/c0d0,
sector 0
Sep 10 18:24:34 virtfedbis kernel: block drbd0: meta data flush failed with
status -95, disabling md-flushes
Sep 10 18:24:34 virtfedbis kernel: block drbd0: disk( Attaching ->
Negotiating )
Sep 10 18:24:34 virtfedbis kernel: block drbd0: drbd_sync_handshake:
Sep 10 18:24:34 virtfedbis kernel: block drbd0: self
D5C42445B9F5C227:0000000000000000:0DB564243F5AA9A3:377245292BBD1112
bits:235520 flags:0
Sep 10 18:24:34 virtfedbis kernel: block drbd0: peer
A0332E51B243BEE1:D5C42445B9F5C227:FFEDAA5E725D8157:13925DF660B57F5D
bits:309189 flags:0
Sep 10 18:24:34 virtfedbis kernel: block drbd0: uuid_compare()=-1 by rule 50
Sep 10 18:24:34 virtfedbis kernel: block drbd0: Becoming sync target due to
disk states.
Sep 10 18:24:34 virtfedbis kernel: block drbd0: conn( Connected -> WFBitMapT
) disk( Negotiating -> Outdated )
Sep 10 18:24:34 virtfedbis kernel: block drbd0: conn( WFBitMapT ->
WFSyncUUID )
Sep 10 18:24:34 virtfedbis kernel: block drbd0: helper command:
/sbin/drbdadm before-resync-target minor-0
Sep 10 18:24:34 virtfedbis kernel: block drbd0: helper command:
/sbin/drbdadm before-resync-target minor-0 exit code 0 (0x0)
Sep 10 18:24:34 virtfedbis kernel: block drbd0: conn( WFSyncUUID ->
SyncTarget ) disk( Outdated -> Inconsistent )
Sep 10 18:24:34 virtfedbis kernel: block drbd0: Began resync as SyncTarget
(will sync 1236756 KB [309189 bits set]).
Sep 10 18:24:34 virtfedbis kernel: block drbd0: Resync aborted.
Sep 10 18:24:34 virtfedbis kernel: block drbd0: conn( SyncTarget ->
Connected ) disk( Inconsistent -> Failed )
Sep 10 18:24:34 virtfedbis kernel: block drbd0: Local IO failed.
Detaching...
Sep 10 18:24:34 virtfedbis kernel: block drbd0: disk( Failed -> Diskless )
Sep 10 18:24:34 virtfedbis kernel: block drbd0: Notified peer that my disk
is broken.

Even after applying the same settings on the other peer, I get:

Sep 10 18:26:06 virtfedbis kernel: block drbd0: disk( Diskless -> Attaching
)
Sep 10 18:26:06 virtfedbis kernel: block drbd0: Found 6 transactions (244
active extents) in activity log.
Sep 10 18:26:06 virtfedbis kernel: block drbd0: Method to ensure write
ordering: barrier
Sep 10 18:26:06 virtfedbis kernel: block drbd0: max_segment_size ( = BIO
size ) = 32768
Sep 10 18:26:06 virtfedbis kernel: block drbd0: recounting of set bits took
additional 1 jiffies
Sep 10 18:26:06 virtfedbis kernel: block drbd0: 920 MB (235520 bits) marked
out-of-sync by on disk bit-map.
Sep 10 18:26:06 virtfedbis kernel: block drbd0: Marked additional 0 KB as
out-of-sync based on AL.
Sep 10 18:26:06 virtfedbis kernel: end_request: I/O error, dev cciss/c0d0,
sector 0
Sep 10 18:26:06 virtfedbis kernel: block drbd0: meta data flush failed with
status -95, disabling md-flushes
Sep 10 18:26:06 virtfedbis kernel: block drbd0: disk( Attaching ->
Negotiating )
Sep 10 18:26:06 virtfedbis kernel: block drbd0: drbd_sync_handshake:
Sep 10 18:26:06 virtfedbis kernel: block drbd0: self
FAFACA8496A4ED9D:0000000000000000:0DB564243F5AA9A3:377245292BBD1112
bits:235520 flags:0
Sep 10 18:26:06 virtfedbis kernel: block drbd0: peer
A0332E51B243BEE1:FAFACA8496A4ED9D:D5C42445B9F5C227:FFEDAA5E725D8157
bits:310129 flags:0
Sep 10 18:26:06 virtfedbis kernel: block drbd0: uuid_compare()=-1 by rule 50
Sep 10 18:26:06 virtfedbis kernel: block drbd0: Becoming sync target due to
disk states.
Sep 10 18:26:06 virtfedbis kernel: block drbd0: conn( Connected -> WFBitMapT
) disk( Negotiating -> Outdated )
Sep 10 18:26:06 virtfedbis kernel: block drbd0: conn( WFBitMapT ->
WFSyncUUID )
Sep 10 18:26:06 virtfedbis kernel: block drbd0: helper command:
/sbin/drbdadm before-resync-target minor-0
Sep 10 18:26:06 virtfedbis kernel: block drbd0: helper command:
/sbin/drbdadm before-resync-target minor-0 exit code 0 (0x0)
Sep 10 18:26:06 virtfedbis kernel: block drbd0: conn( WFSyncUUID ->
SyncTarget ) disk( Outdated -> Inconsistent )
Sep 10 18:26:06 virtfedbis kernel: block drbd0: Began resync as SyncTarget
(will sync 1240516 KB [310129 bits set]).
Sep 10 18:26:07 virtfedbis kernel: block drbd0: Resync aborted.
Sep 10 18:26:07 virtfedbis kernel: block drbd0: conn( SyncTarget ->
Connected ) disk( Inconsistent -> Failed )
Sep 10 18:26:07 virtfedbis kernel: block drbd0: Local IO failed.
Detaching...
Sep 10 18:26:07 virtfedbis kernel: block drbd0: 1121 messages suppressed in
/root/drbd-8.3.3rc1/dist/BUILD/drbd-8.3.3rc1/drbd/drbd_receiver.c:1573.
Sep 10 18:26:07 virtfedbis kernel: block drbd0: Can not write resync data to
local disk.
Sep 10 18:26:07 virtfedbis kernel: block drbd0: drbd_rs_complete_io()
called, but extent not found
Sep 10 18:26:07 virtfedbis kernel: block drbd0: drbd_rs_complete_io()
called, but extent not found
Sep 10 18:26:07 virtfedbis kernel: block drbd0: drbd_rs_complete_io()
called, but extent not found
Sep 10 18:26:07 virtfedbis kernel: block drbd0: Can not write resync data to
local disk.
Sep 10 18:26:07 virtfedbis kernel: block drbd0: drbd_rs_complete_io()
called, but extent not found
Sep 10 18:26:07 virtfedbis kernel: block drbd0: drbd_rs_complete_io()
called, but extent not found
Sep 10 18:26:07 virtfedbis kernel: block drbd0: drbd_rs_complete_io()
called, but extent not found
Sep 10 18:26:07 virtfedbis kernel: block drbd0: drbd_rs_complete_io()
called, but extent not found
Sep 10 18:26:07 virtfedbis kernel: block drbd0: drbd_rs_complete_io()
called, but extent not found
Sep 10 18:26:07 virtfedbis kernel: block drbd0: drbd_rs_complete_io()
called, but extent not found
Sep 10 18:26:07 virtfedbis kernel: block drbd0: Can not write resync data to
local disk.
Sep 10 18:26:07 virtfedbis kernel: block drbd0: Can not write resync data to
local disk.
Sep 10 18:26:07 virtfedbis kernel: block drbd0: drbd_rs_complete_io()
called, but extent not found
Sep 10 18:26:07 virtfedbis kernel: block drbd0: drbd_rs_complete_io()
called, but extent not found
Sep 10 18:26:07 virtfedbis kernel: block drbd0: disk( Failed -> Diskless )
Sep 10 18:26:07 virtfedbis kernel: block drbd0: Notified peer that my disk
is broken.
Sep 10 18:26:07 virtfedbis kernel: block drbd0: Can not write resync data to
local disk.
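
For reading the resync sizes in the logs above: the numbers are consistent
with each out-of-sync bit covering 4 KB (309189 bits -> 1236756 KB, 310129
bits -> 1240516 KB, 235520 bits -> 920 MB). A small sketch of that arithmetic;
the function name bits_to_kb is only illustrative:

```shell
#!/bin/sh
# Hypothetical helper: convert a DRBD out-of-sync bit count into KB,
# assuming one bitmap bit per 4 KB block, as the log figures suggest.
bits_to_kb() {
    echo $(( $1 * 4 ))
}

# Examples taken from the two attach attempts:
#   bits_to_kb 309189
#   bits_to_kb 310129
```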