[Drbd-dev] [PATCH 1/3] drbd: remove WRITE_SAME support
Michael Labriola
veggiemike at sourceruckus.org
Mon Sep 12 21:08:04 CEST 2022
On Fri, Sep 9, 2022 at 9:04 AM Lars Ellenberg <lars.ellenberg at linbit.com> wrote:
>
> On Fri, Sep 09, 2022 at 09:57:23AM +0200, Joel Colledge wrote:
> > Hi Michael,
> >
> > Thanks for the patch!
> >
> > > This commit mimics upstream commit a34592ff6b78, which removes all the
> > > WRITE_SAME code because "REQ_OP_WRITE_SAME was only ever submitted by
> > > the legacy Linux zeroing code, which has switched to use
> > > REQ_OP_WRITE_ZEROES long ago." WRITE_SAME was then removed from Linux
> > > 5.18.
> >
> > This is an interesting case. If we remove all the write-same code like
> > this, then we either need to add it back in via compat, or we lose a
> > feature. I think it is OK to lose the feature on 5.18+ kernels, since
> > the rest of the kernel does not use it any more. On older kernels this
> > might break real use cases.
> >
> > There is also the case where our peer is running an older kernel and
> > we are running 5.18+. I think we should worry about that after
> > deciding what to do with peers that run the same kernel.
> >
> > I've added Lars to CC because he is more familiar with the historical
> > changes in this area.
> >
> > Note: If we do remove the feature entirely, we should stop advertising
> > the feature flag DRBD_FF_WSAME so that the peer knows that we do not
> > support it.
>
> Problem with that is: I overloaded this "feature flag":
> /* supports REQ_WRITE_SAME on the "wire" protocol.
> * Note: this flag is overloaded,
> * its presence also
> * - indicates support for 128 MiB "batch bios",
> * max discard size of 128 MiB
> * instead of 4M before that.
> * - indicates that we exchange additional settings in p_sizes
> * drbd_send_sizes()/receive_sizes()
> */
Yeah, I stumbled onto that while on my first cut of this patch...
quickly backpedaled and did as little as possible in case there was
any other unanticipated ripple.
>
> I did that to avoid too much special casing with protocol version numbers,
> we had 8.4 and 9 version ranges, and introduced the feature on both.
>
> Takeaway: do not overload feature flags :-|
>
> My original commit message:
> drbd: introduce WRITE_SAME support
>
> We will support WRITE_SAME, if
> * all peers support WRITE_SAME (both in kernel and DRBD version),
> * all peer devices support WRITE_SAME
> * logical_block_size is identical on all peers.
>
> We may at some point introduce a fallback on the receiving side
> for devices/kernels that do not support WRITE_SAME,
> by open-coding a submit loop. But not yet.
>
> I don't think "open code a compat fallback" is useful,
> the only "relevant" usage of WSAME was ZERO-OUT,
> which got its own flag in upstream kernel (and then DRBD) in 2018.
>
> I don't think "WSAME" compat accross 8.4 <-> 9 is something we need to worry
> about, I'm fine with not supporting it or the other things it indicated when
> connecting 8.4 <-> 9, which should only be done temporarily.
> Most of the time a "stop everything; upgrade all nodes; start everything"
> will be more effective and have shorter overall downtime than a rolling upgrade
> one-by-one, moving services by stopping, then starting on an other node.
>
> But actually we do not need to remove the DRBD_FF_WSAME flag, because WSAME
> will not be used even if the feature flag is present, if we don't indicate
> support for it in the p_sizes per "peer device" aka volume.
> we now always report p->qlim->write_same_capable = 0.
>
> ==> I think the patch is okay as is.
Ok, good to hear. I was a lot less sure of this one than other
"keeping up with mainline" patches I've worked on in the past.
I think I've gotten a little lost regarding the need for a compat
patch, though. Are you saying that because the only real benefit of
WSAME is now taken care of by ZERO-OUT, there really is no need for a
compat patch to put WSAME back into the drbd9 tree (for pre-5.18
kernels)?
>
> If we want, we can just rename the flag,
> %s/DRBD_FF_WSAME/DRBD_FF_EXTENDED_QLIM/g or something.
>
> Should we want to drop the flag,
> for 9 <-> 9, git log (below) says that we can use PRO_VERSION >= 111
> to indicate "128 MiB batch bio size" and "extended p_sizes".
>
> 2021-07-30 95adb51a1 drbd: Allow reverting resync direction by usind the --discard-my-data flag #define PRO_VERSION_MAX 121
> 2021-04-14 43a7362fa drbd: new option for "invalidate(-remote)" commands: "--reset-bitmap=no" #define PRO_VERSION_MAX 120
> 2021-01-25 4f2682e38 drbd: Generalize when resync direction is based on disk states #define PRO_VERSION_MAX 119
> 2020-10-16 e595bfc63 drbd: Serialize connects and promotes properly #define PRO_VERSION_MAX 118
> 2020-06-19 1f59029d7 drbd: Send config details, size, data UUIDs, and initial state before first 2pc #define PRO_VERSION_MAX 117
> 2019-10-23 2f144892f drbd: Allow resync of a re-created peer from day0 UUID #define PRO_VERSION_MAX 116
> 2019-05-24 50124b2b9 drbd: introduce P_RS_CANCEL_AHEAD #define PRO_VERSION_MAX 115
> 2018-07-19 3d98754ed drbd: protocol 114: fix distributed deadlock on secondary activity log #define PRO_VERSION_MAX 114
> 2018-04-11 6036fafcc drbd: introduce P_ZEROES (REQ_OP_WRITE_ZEROES on the "wire")
> 2018-03-13 d987c27a7 drbd: Automatically resolve split-brain in a specific case with quorum #define PRO_VERSION_MAX 113
> 2016-08-29 6708bd087 drbd: two-phase resize #define PRO_VERSION_MAX 112
> 2016-01-28 399b81433 fixed resize for multiple nodes #define PRO_VERSION_MAX 111
> 2015-12-04 aeee107ec drbd: introduce WRITE_SAME support
> 2012-01-24 6a4b0c000 drbd: Implement cluster-wide state changes using two-phase commit #define PRO_VERSION_MAX 110
>
> _______________________________________________
> drbd-dev mailing list
> drbd-dev at lists.linbit.com
> https://lists.linbit.com/mailman/listinfo/drbd-dev
More information about the drbd-dev
mailing list