[Drbd-dev] [PATCH 1/3] drbd: remove WRITE_SAME support

Lars Ellenberg lars.ellenberg at linbit.com
Fri Sep 9 15:04:01 CEST 2022


On Fri, Sep 09, 2022 at 09:57:23AM +0200, Joel Colledge wrote:
> Hi Michael,
> 
> Thanks for the patch!
> 
> > This commit mimics upstream commit a34592ff6b78, which removes all the
> > WRITE_SAME code because "REQ_OP_WRITE_SAME was only ever submitted by
> > the legacy Linux zeroing code, which has switched to use
> > REQ_OP_WRITE_ZEROES long ago."  WRITE_SAME was then removed from Linux
> > 5.18.
> 
> This is an interesting case. If we remove all the write-same code like
> this, then we either need to add it back in via compat, or we lose a
> feature. I think it is OK to lose the feature on 5.18+ kernels, since
> the rest of the kernel does not use it any more. On older kernels this
> might break real use cases.
> 
> There is also the case where our peer is running an older kernel and
> we are running 5.18+. I think we should worry about that after
> deciding what to do with peers that run the same kernel.
> 
> I've added Lars to CC because he is more familiar with the historical
> changes in this area.
> 
> Note: If we do remove the feature entirely, we should stop advertising
> the feature flag DRBD_FF_WSAME so that the peer knows that we do not
> support it.

Problem with that is: I overloaded this "feature flag":
/* supports REQ_WRITE_SAME on the "wire" protocol.
 * Note: this flag is overloaded,
 * its presence also
 *   - indicates support for 128 MiB "batch bios",
 *     max discard size of 128 MiB
 *     instead of 4M before that.
 *   - indicates that we exchange additional settings in p_sizes
 *     drbd_send_sizes()/receive_sizes()
 */

I did that to avoid too much special casing with protocol version numbers,
we had 8.4 and 9 version ranges, and introduced the feature on both.

Takeaway: do not overload feature flags :-|

My original commit message:
    drbd: introduce WRITE_SAME support
    
    We will support WRITE_SAME, if
     * all peers support WRITE_SAME (both in kernel and DRBD version),
     * all peer devices support WRITE_SAME
     * logical_block_size is identical on all peers.
    
    We may at some point introduce a fallback on the receiving side
    for devices/kernels that do not support WRITE_SAME,
    by open-coding a submit loop. But not yet.

I don't think "open code a compat fallback" is useful,
the only "relevant" usage of WSAME was ZERO-OUT,
which got its own flag in upstream kernel (and then DRBD) in 2018.

I don't think "WSAME" compat accross 8.4 <-> 9 is something we need to worry
about, I'm fine with not supporting it or the other things it indicated when
connecting 8.4 <-> 9, which should only be done temporarily.
Most of the time a "stop everything; upgrade all nodes; start everything"
will be more effective and have shorter overall downtime than a rolling upgrade
one-by-one, moving services by stopping, then starting on an other node.

But actually we do not need to remove the DRBD_FF_WSAME flag, because WSAME
will not be used even if the feature flag is present, if we don't indicate
support for it in the p_sizes per "peer device" aka volume.
we now always report p->qlim->write_same_capable = 0.

==> I think the patch is okay as is.

If we want, we can just rename the flag,
%s/DRBD_FF_WSAME/DRBD_FF_EXTENDED_QLIM/g or something.

Should we want to drop the flag,
for 9 <-> 9, git log (below) says that we can use PRO_VERSION >= 111
to indicate "128 MiB batch bio size" and "extended p_sizes".

2021-07-30 95adb51a1 drbd: Allow reverting resync direction by usind the --discard-my-data flag        #define PRO_VERSION_MAX 121
2021-04-14 43a7362fa drbd: new option for "invalidate(-remote)" commands: "--reset-bitmap=no"          #define PRO_VERSION_MAX 120
2021-01-25 4f2682e38 drbd: Generalize when resync direction is based on disk states                    #define PRO_VERSION_MAX 119
2020-10-16 e595bfc63 drbd: Serialize connects and promotes properly                                    #define PRO_VERSION_MAX 118
2020-06-19 1f59029d7 drbd: Send config details, size, data UUIDs, and initial state before first 2pc   #define PRO_VERSION_MAX 117
2019-10-23 2f144892f drbd: Allow resync of a re-created peer from day0 UUID                            #define PRO_VERSION_MAX 116
2019-05-24 50124b2b9 drbd: introduce P_RS_CANCEL_AHEAD                                                 #define PRO_VERSION_MAX 115
2018-07-19 3d98754ed drbd: protocol 114: fix distributed deadlock on secondary activity log            #define PRO_VERSION_MAX 114
2018-04-11 6036fafcc drbd: introduce P_ZEROES (REQ_OP_WRITE_ZEROES on the "wire")
2018-03-13 d987c27a7 drbd: Automatically resolve split-brain in a specific case with quorum            #define PRO_VERSION_MAX 113
2016-08-29 6708bd087 drbd: two-phase resize                                                            #define PRO_VERSION_MAX 112
2016-01-28 399b81433 fixed resize for multiple nodes                                                   #define PRO_VERSION_MAX 111
2015-12-04 aeee107ec drbd: introduce WRITE_SAME support
2012-01-24 6a4b0c000 drbd: Implement cluster-wide state changes using two-phase commit                 #define PRO_VERSION_MAX 110



More information about the drbd-dev mailing list