[Drbd-dev] [PATCH 1/3] drbd: remove WRITE_SAME support
Lars Ellenberg
lars.ellenberg at linbit.com
Fri Sep 9 15:04:01 CEST 2022
On Fri, Sep 09, 2022 at 09:57:23AM +0200, Joel Colledge wrote:
> Hi Michael,
>
> Thanks for the patch!
>
> > This commit mimics upstream commit a34592ff6b78, which removes all the
> > WRITE_SAME code because "REQ_OP_WRITE_SAME was only ever submitted by
> > the legacy Linux zeroing code, which has switched to use
> > REQ_OP_WRITE_ZEROES long ago." WRITE_SAME was then removed from Linux
> > 5.18.
>
> This is an interesting case. If we remove all the write-same code like
> this, then we either need to add it back in via compat, or we lose a
> feature. I think it is OK to lose the feature on 5.18+ kernels, since
> the rest of the kernel does not use it any more. On older kernels this
> might break real use cases.
>
> There is also the case where our peer is running an older kernel and
> we are running 5.18+. I think we should worry about that after
> deciding what to do with peers that run the same kernel.
>
> I've added Lars to CC because he is more familiar with the historical
> changes in this area.
>
> Note: If we do remove the feature entirely, we should stop advertising
> the feature flag DRBD_FF_WSAME so that the peer knows that we do not
> support it.
Problem with that is: I overloaded this "feature flag":
/* supports REQ_WRITE_SAME on the "wire" protocol.
* Note: this flag is overloaded,
* its presence also
* - indicates support for 128 MiB "batch bios",
* max discard size of 128 MiB
* instead of 4M before that.
* - indicates that we exchange additional settings in p_sizes
* drbd_send_sizes()/receive_sizes()
*/
I did that to avoid too much special casing with protocol version numbers,
we had 8.4 and 9 version ranges, and introduced the feature on both.
Takeaway: do not overload feature flags :-|
My original commit message:
drbd: introduce WRITE_SAME support
We will support WRITE_SAME, if
* all peers support WRITE_SAME (both in kernel and DRBD version),
* all peer devices support WRITE_SAME
* logical_block_size is identical on all peers.
We may at some point introduce a fallback on the receiving side
for devices/kernels that do not support WRITE_SAME,
by open-coding a submit loop. But not yet.
I don't think "open code a compat fallback" is useful,
the only "relevant" usage of WSAME was ZERO-OUT,
which got its own flag in upstream kernel (and then DRBD) in 2018.
I don't think "WSAME" compat accross 8.4 <-> 9 is something we need to worry
about, I'm fine with not supporting it or the other things it indicated when
connecting 8.4 <-> 9, which should only be done temporarily.
Most of the time a "stop everything; upgrade all nodes; start everything"
will be more effective and have shorter overall downtime than a rolling upgrade
one-by-one, moving services by stopping, then starting on an other node.
But actually we do not need to remove the DRBD_FF_WSAME flag, because WSAME
will not be used even if the feature flag is present, if we don't indicate
support for it in the p_sizes per "peer device" aka volume.
we now always report p->qlim->write_same_capable = 0.
==> I think the patch is okay as is.
If we want, we can just rename the flag,
%s/DRBD_FF_WSAME/DRBD_FF_EXTENDED_QLIM/g or something.
Should we want to drop the flag,
for 9 <-> 9, git log (below) says that we can use PRO_VERSION >= 111
to indicate "128 MiB batch bio size" and "extended p_sizes".
2021-07-30 95adb51a1 drbd: Allow reverting resync direction by usind the --discard-my-data flag #define PRO_VERSION_MAX 121
2021-04-14 43a7362fa drbd: new option for "invalidate(-remote)" commands: "--reset-bitmap=no" #define PRO_VERSION_MAX 120
2021-01-25 4f2682e38 drbd: Generalize when resync direction is based on disk states #define PRO_VERSION_MAX 119
2020-10-16 e595bfc63 drbd: Serialize connects and promotes properly #define PRO_VERSION_MAX 118
2020-06-19 1f59029d7 drbd: Send config details, size, data UUIDs, and initial state before first 2pc #define PRO_VERSION_MAX 117
2019-10-23 2f144892f drbd: Allow resync of a re-created peer from day0 UUID #define PRO_VERSION_MAX 116
2019-05-24 50124b2b9 drbd: introduce P_RS_CANCEL_AHEAD #define PRO_VERSION_MAX 115
2018-07-19 3d98754ed drbd: protocol 114: fix distributed deadlock on secondary activity log #define PRO_VERSION_MAX 114
2018-04-11 6036fafcc drbd: introduce P_ZEROES (REQ_OP_WRITE_ZEROES on the "wire")
2018-03-13 d987c27a7 drbd: Automatically resolve split-brain in a specific case with quorum #define PRO_VERSION_MAX 113
2016-08-29 6708bd087 drbd: two-phase resize #define PRO_VERSION_MAX 112
2016-01-28 399b81433 fixed resize for multiple nodes #define PRO_VERSION_MAX 111
2015-12-04 aeee107ec drbd: introduce WRITE_SAME support
2012-01-24 6a4b0c000 drbd: Implement cluster-wide state changes using two-phase commit #define PRO_VERSION_MAX 110
More information about the drbd-dev
mailing list