[DRBD-user] DRBD Trouble (block drbd0: local WRITE IO error sector)

Mon Feb 20 06:01:54 CET 2017

Hello Lars,
Thank you for your kind answer.

As shown below, we found that this error can be avoided by
disabling the WRITE SAME command.

1) echo 0 | tee /sys/block/sdc/device/../0:0:2:0/scsi_disk/0:0:2:0/max_write_same_blocks
This changes max_write_same_blocks from 65535 to 0. (sdc is the disk we use for the DRBD area.)
2) vgchange -an; vgchange -ay
Because we use LVM, the volume group has to be reactivated to pick up the
above change. (Is it safe to run this while the system is running? Please
let me know if you know.)
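
For reference, step 1 can be wrapped in a small shell function that writes 0 and then reads the value back, so the result is visible. This is only a rough sketch under some assumptions: the function name disable_write_same and the SYSFS_ROOT variable are hypothetical (SYSFS_ROOT is there only so the function can be tried against a scratch directory), and it assumes /sys/block/<dev>/device resolves to the SCSI device directory that contains scsi_disk/<H:C:T:L>/.

```shell
#!/bin/sh
# Sketch: set max_write_same_blocks to 0 for one backing disk, then read it back.
# SYSFS_ROOT is a hypothetical override (default: /sys) for trying this out
# against a scratch directory instead of the real sysfs.
disable_write_same() {
    dev="$1"
    root="${SYSFS_ROOT:-/sys}"
    status=1                              # stays 1 if no matching file is found
    for f in "$root"/block/"$dev"/device/scsi_disk/*/max_write_same_blocks; do
        [ -e "$f" ] || continue           # the glob matched nothing
        echo 0 > "$f" || return 1         # same effect as: echo 0 | tee "$f"
        printf '%s = %s\n' "$f" "$(cat "$f")"
        status=0
    done
    return "$status"
}
```

It would be called as root, for example `disable_write_same sdc`, before DRBD is brought up; if the volume groups on that disk are already active, the `vgchange -an; vgchange -ay` step is presumably still needed afterwards.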

Also, should we be concerned about performance degradation from disabling WRITE SAME?

Thank you.

2017-02-17 22:14 GMT+09:00 Lars Ellenberg <lars.ellenberg at linbit.com>:
> On Fri, Feb 03, 2017 at 03:32:39PM +0900, Seiichirou Hiraoka wrote:
>> Hello.
>>
>> I use DRBD in the following environment.
>>
>> OS: Redhat Enterprise Linux 7.1
>> Pacemaker: 1.1.12 (CentOS Repository)
>> Corosync: 2.3.4 (CentOS Repository)
>> DRBD: 8.4.9 (ELRepo)
>> # rpm -qi drbd84-utils
>> Name        : drbd84-utils
>> Version     : 8.9.2
>> Release     : 2.el7.elrepo
>> Architecture: x86_64
>> Vendor: The ELRepo Project (http://elrepo.org)
>> # rpm -qi kmod-drbd84
>> Name        : kmod-drbd84
>> Version     : 8.4.9
>> Release     : 1.el7.elrepo
>> Architecture: x86_64
>>
>> Although DRBD is operated on two servers (server1, server2),
>> the following error message suddenly appears
>> and writing to the DRBD area can not be performed.
>>
>> . server1(master)
>> Jan 20 10:41:16 server1 kernel: block drbd0: local WRITE IO error sector 118616936+40 on dm-0
>> Jan 20 10:41:16 server1 kernel: block drbd0: disk( UpToDate -> Failed )
>> Jan 20 10:41:16 server1 kernel: block drbd0: Local IO failed in __req_mod. Detaching...
>> Jan 20 10:41:16 server1 kernel: block drbd0: 0 KB (0 bits) marked out-of-sync by on disk bit-map.
>> Jan 20 10:41:16 server1 kernel: block drbd0: disk( Failed -> Diskless )
>> Jan 20 10:41:16 server1 kernel: block drbd0: Got NegDReply; Sector 117512416s, len 4096.
>> Jan 20 10:41:16 server1 kernel: drbd0: WRITE SAME failed. Manually zeroing.
>
>
> That ^^ is the relevant hint.
>
> VMWare "virtual" disks seem to love to pretend to be able to do WRITE SAME,
> but when they actually see such requests, fail them with IO error.
> (Not blaming VMWare here, maybe other (real/virtual) disks show the same
> behavior. It's just the most frequent "offender" currently)
>
> That's not easy for DRBD to handle.
> Next DRBD release will have a config switch to turn off write-same
> support for specific DRBD volumes.
>
> Meanwhile, available workarounds:
>
> use a different type of virtual disk ("sata" may work), something that
> does not claim to support something it then does not handle.
>
> or, *early* in the boot process (before you bring up DRBD),
> disable write same like this:
> echo 0 | tee /sys/block/*/device/../*/scsi_disk/*/max_write_same_blocks
> (for the relevant backend devices)
>
> If you use LVM, you may need to vgchange -an ; vgchange -ay after that,
> (at least for the relevant VGs), if they have already been activated.
>
> --
> : Lars Ellenberg
> : LINBIT | Keeping the Digital World Running
> : DRBD -- Heartbeat -- Corosync -- Pacemaker
>
> DRBD® and LINBIT® are registered trademarks of LINBIT