[DRBD-announce] drbd-9.0.14

Lars Ellenberg lars.ellenberg at linbit.com
Wed May 2 13:58:38 CEST 2018

On Tue, Apr 17, 2018 at 03:24:08PM +0200, Philipp Reisner wrote:
> Hi,
> This is a strongly recommended update for all drbd-9.0.x users.
> It contains serious fixes for cases with multiple diskless
> nodes. Without these fixes you can even see wrong data read back from
> DRBD under complicated failure cases.
> We slightly delayed the release to finish compatibility with
> the kernel of the recently released RHEL-7.5.

And here is the DRBD 9.0.14 release,
fixing a couple of regressions we introduced in 9.0.13, like
auto-split-brain recovery handlers not being invoked; but you want
to *avoid* split brain and data divergence in the first place...

Excerpt from the ChangeLog
9.0.14-1 (api:genl2/proto:86-113/transport:14)
 * fix regression in 9.0.13: call after-split-brain-recovery handlers again;
   in 9.0.13, no auto-recovery strategies (not even the default: disconnect)
   were applied, nodes stayed connected, and all nodes tried to become the
   source of the resync.
 * fix spurious temporary promotion failure: if after Primary loss
   failover happened too quickly, transparently retry internally.
 * fixup recently introduced P_ZEROES to actually work as intended
 * fix online-verify to account for skipped blocks; otherwise, it won't
   notice that it has finished, apparently being stuck near "100% done"
 * expose more resync and online-verify statistics and details
 * improve accounting of "in-flight" data and resync requests
 * allow taking down an already useless minor device during "down",
   even if it is (temporarily) opened by for example udev scanning
 * fix for a node staying "only" Consistent and not returning to UpToDate
   in certain scenarios when fencing is enabled
 * fix data generation UUID propagation during resync
 * compat for upstream kernels up to v4.17
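
The online-verify item above is easier to picture with a toy model. The
following Python sketch is not DRBD code; the function name and its
parameters are made up for illustration. It only assumes what the changelog
entry says: if skipped blocks are not counted as handled, the "done" counter
can never reach the total, so the run looks stuck close to 100%.

```python
def verify_progress(total_blocks, skipped, count_skipped):
    """Toy model of a verify run (hypothetical, not the DRBD implementation).

    Walks all blocks once and returns (done, finished), where `done` is the
    number of blocks accounted for and `finished` tells whether the run
    would be recognized as complete.
    """
    done = 0
    for block in range(total_blocks):
        if block in skipped:
            if count_skipped:
                # Fixed behaviour: a skipped block still counts as handled.
                done += 1
            continue
        # Block was actually compared.
        done += 1
    return done, done == total_blocks


# Without accounting for skipped blocks the run never looks finished.
print(verify_progress(100, {3, 7}, count_skipped=False))
# With the accounting in place the counter reaches the total.
print(verify_progress(100, {3, 7}, count_skipped=True))
```

With two of 100 blocks skipped, the first call ends at 98 accounted blocks
and is never considered finished; the second reaches 100.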


To take advantage of the more detailed resync and finally correct
online-verify stats, you will need to update your drbd-utils to 9.4,
which we expect to release later this week.

I'll leave the 9.0.13 changelog here for reference as well:

> 9.0.13-1 (api:genl2/proto:86-113/transport:14)
> --------
>  * abort a resync if a resync source becomes weakly connected and the
>    sync target is a neighbor of the primary; the lack of doing so was
>    a possible source of data corruption
>  * fix UUID handling with multiple diskless nodes: if the primary role
>    is moved between them and no write happens before the storage
>    nodes are disconnected, the storage nodes would, before this fix,
>    outdate themselves upon reconnect
>  * When a data-set gets into contact (attach or connect) with an all
>    diskless cluster with a primary and the exposed UUID does not match
>    the arriving data-set, make sure to either set it to "Consistent"
>    or to reject the attach
>  * correctly handle when a node that was marked as intentional diskless
>    should get a disk; allocate bitmap slots when the --bitmap=no flag
>    gets removed; reject peers to attach if they are marked with --bitmap=no
>  * fix outdating of weakly connected nodes; it was broken when an already
>    primary node joined the cluster at the other end
>  * made returning from Ahead to SyncSource more reliable; the old code
>    may have missed the event if the write to the local backend was still
>    pending when the barrier-ack comes in
>  * fix a hard to trigger deadlock in the receiver; it triggered sometimes
>    on the Secondary if a resync was going on and writes on the primary
>    happen to the same area while the connection is interrupted; it caused
>    the device to be stuck in "NetworkFailure" state
>  * fix online resize in the presence of two or more diskless nodes
>  * fix online add of volumes to diskless nodes when they already have
>    established connections
>  * Set the SO_KEEPALIVE socket option on data sockets. Can be important
>    if long lived DRBD connections go through a firewall with connection
>    tracking
>  * automatically solve a specific split brain when quorum is enabled
>    and a node does no IO between losing connections to other nodes
>  * Compat: Drop support for kernels older than 2.6.32 and distros older
>    than RHEL6; added support for kernels up to v4.15.x
>  * new wire packet P_ZEROES, a cousin of P_DISCARD, following the kernel
>    as it introduced separate BIO ops for writing zeros and discarding
>  * compat workaround for two RHEL 7.5 idiosyncrasies regarding refcount_t
>    and struct nla_policy
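
For readers who have not met the SO_KEEPALIVE option mentioned in the 9.0.13
list above: DRBD sets it inside the kernel, so the following user-space
Python sketch only illustrates the socket option itself. The helper names are
hypothetical. With keepalive enabled, the TCP stack sends periodic probes on
an otherwise idle connection, which is what keeps a connection-tracking
firewall from forgetting a long-lived but quiet link.

```python
import socket


def make_keepalive_socket():
    """Create a TCP socket with SO_KEEPALIVE enabled (hypothetical helper)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Ask the TCP stack to send keepalive probes on an idle connection.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    return sock


def keepalive_enabled(sock):
    """Return True if SO_KEEPALIVE is set on the given socket."""
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0


sock = make_keepalive_socket()
print(keepalive_enabled(sock))
sock.close()
```

The probe timing itself is governed by system-wide or per-socket TCP
settings, which this sketch leaves at their defaults.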

: Lars Ellenberg
: LINBIT | Keeping the Digital World Running
: DRBD -- Heartbeat -- Corosync -- Pacemaker

DRBD® and LINBIT® are registered trademarks of LINBIT
