On Thu, Jan 12, 2017 at 06:00:53PM +0100, Lars Ellenberg wrote:
> On Wed, Jan 11, 2017 at 06:23:08PM +0100, knebb at knebb.de wrote:
> > Hi Lars and all,
> >
> >
> > >> I have to cross-post to the LVM as well as to the DRBD mailing list as I have no
> > >> clue where the issue is- if it's not a bug...
> > >>
> > >> I cannot get LVM working on top of drbd- I am getting I/O errors
> > >> followed by "diskless" state.
> > > For some reason, (some? not only?) VMWare virtual disks tend to pretend
> > > to support "write same", even if they fail such requests later.
> > >
> > > DRBD treats such failed WRITE-SAME the same way as any other backend
> > > error, and by default detaches.
> > Ok, it is beyond my knowledge, but I understand what the "write-same"
> > command does. But if the underlying physical disk offers the command and
> > reports an error when used this should apply to mkfs.ext4 on the
> > device/partition as well, shouldn't it?
>
> In this case, it happens on first mount.
> Also, it is not an "EIO", but an "EOPNOTSUPP".
>
> What really happens is that the file system code calls
> blkdev_issue_zeroout(),
> which will try discard, if discard is available and discard zeroes data,
> or, if discard (with discard zeroes data) is not available or returns
> failure, tries write-same with ZERO_PAGE,
> or, if write-same is not available or returns failure,
> tries __blkdev_issue_zeroout() (which uses "normal" writes).
>
> At least in "current upstream", probably very similar in your
> almost-3.10.something kernel.
>
> DRBD sits in between, sees the failure return of write-same,
> and handles it by detaching.
>
> > drbd detaches when an error is
> > reported- but why does Linux not report an error without drbd? And why
> > does this only happen when using LVM in-between? Should be the same when
> > LVM is not used....
>
> Yes. And it is, as far as I can tell.
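As a rough illustration of the fallback order described above (discard, then write-same, then normal writes), here is a toy shell model. It is not kernel code: the function name `zeroout_attempts` and its `ok` / `fail` / `na` arguments are made up for this sketch, where `na` means the request type is not advertised and `fail` means it is advertised but the request fails.

```shell
#!/bin/sh
# Toy model of the zero-out fallback order (hypothetical helper, not kernel code).
#   $1: discard (with "discard zeroes data"): ok | fail | na
#   $2: write-same:                           ok | fail | na
# Prints the request types that would be attempted, in order.
zeroout_attempts() {
    attempts=""
    case $1 in
        ok)   echo "discard"; return 0 ;;
        fail) attempts="discard(failed) " ;;
    esac
    case $2 in
        ok)   echo "${attempts}write-same"; return 0 ;;
        fail) attempts="${attempts}write-same(failed) " ;;
    esac
    # Last resort: ordinary writes of zero-filled pages.
    echo "${attempts}normal-writes"
}

# A backend that does not offer a zeroing discard, and advertises
# write-same but then fails the request:
zeroout_attempts na fail    # prints: write-same(failed) normal-writes
```

In this model the fallback to normal writes always succeeds. With DRBD in the stack, the failed write-same has already triggered the detach by the time the fallback runs, which the toy model does not capture.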
>
> > > Older kernels (RHEL 6) and also older drbd (8.3) are not affected, because they
> > > don't know about write-same.
> > My primary host is running CentOS7 while the secondary is older
> > (CentOS6). I will try to create the ext4 on the secondary and then
> > switch to primary.
> >
> > > Or tell the system that the backend does not support write-same:
> > > Check setting:
> > > grep ^ /sys/block/*/device/scsi_disk/*/max_write_same_blocks
> > > disable:
> > > echo 0 | tee /sys/block/*/device/scsi_disk/*/max_write_same_blocks
> > >
> > A "find /sys -name "*same*"" does not report any files named
>
> double check that, please.
> all my centos7 / RHEL 7 (and other distributions with sufficiently new
> kernel) have that.
>
> there are both the read-only /sys/block/*/queue/write_same_max_bytes
> and the writable /sys/devices/*/*/*/host*/target*/*/scsi_disk/*/max_write_same_blocks
>
> > "max_write_same_blocks". On neither of the two nodes. So I cannot
> > disable nor verify if it's enabled. I assume no as it does not exist. So
> > this might not be the reason.
>
> show us lsblk -t and lsblk -D from the box that detaches.
> (the "7" one)
>
> It may also be that a discard failed, in which case it could be
> device mapper pretending discard was supported, and the backend failing
> that discard request. Or some combination there.
>
> Your original logs show
> > Jan 7 10:58:44 backuppc kernel: EXT4-fs (dm-2): mounted filesystem with ordered data mode. Opts: (null)
> > Jan 7 10:58:48 backuppc kernel: block drbd1: local WRITE IO error sector 5296+3960 on sdc
>
> The "+..." part is the length (number of sectors) of the request.
> We don't allow "normal" requests of that size, so this is either a
> discard or write-same.
>
> > Jan 7 10:58:48 backuppc kernel: block drbd1: disk( UpToDate -> Failed )
>
> > Jan 7 10:58:48 backuppc kernel: block drbd1: IO ERROR: neither local nor remote data, sector 29096+3968
>
> > Jan 7 10:58:48 backuppc kernel: dm-2: WRITE SAME failed. Manually zeroing.
>
> And here we see that at least some WRITE SAME was issued, and returned failure.
> and device mapper, which in your case sits above DRBD,
> and consumes that error, has its own fallback code for failed write-same.

Correcting myself, the presence of the warning message misled me.
The 3.10 kernel still has that warning message directly in
blkdev_issue_zeroout(), so that's not the device mapper fallback,
but simply the mechanism I described above, with additional
"log that I took the fallback because of failure".

Which means DISCARDS have not even been tried, or we'd have a message
about that as well.

> Which can no longer be serviced, because DRBD already detached.
>
> So yes,
> I'm pretty sure that I did not pull my "best guess" out of thin air only
> ;-)

--
: Lars Ellenberg
: LINBIT | Keeping the Digital World Running
: DRBD -- Heartbeat -- Corosync -- Pacemaker

DRBD® and LINBIT® are registered trademarks of LINBIT
__
please don't Cc me, but send to list -- I'm subscribed
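
To read the "+..." length out of a DRBD error line like the ones quoted above, a small shell helper along these lines may help. This is only a sketch: the function name `request_size` is hypothetical, and the byte count assumes the sectors in the log are 512-byte sectors.

```shell
#!/bin/sh
# Pull "start+length" out of a log line containing "sector START+LENGTH"
# and print the request size. Hypothetical helper; assumes 512-byte sectors.
request_size() {
    spec=$(printf '%s\n' "$1" |
        sed -n 's/.*sector \([0-9][0-9]*\)+\([0-9][0-9]*\).*/\1 \2/p')
    # No "sector START+LENGTH" in the line: report failure.
    [ -n "$spec" ] || return 1
    # Split "START LENGTH" into $1 and $2.
    set -- $spec
    echo "start=$1 sectors=$2 bytes=$(( $2 * 512 ))"
}

request_size 'block drbd1: local WRITE IO error sector 5296+3960 on sdc'
# prints: start=5296 sectors=3960 bytes=2027520
```

Under that assumption, the failed request from the log is just under 2 MiB in one piece, which fits the remark above that a request of this size is either a discard or a write-same rather than a "normal" write.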