[PATCH] drbd: fix "LOGIC BUG" in drbd_al_begin_io_nonblock()
Christoph Böhmwalder
christoph.boehmwalder at linbit.com
Thu Feb 19 15:20:12 CET 2026
From: Lars Ellenberg <lars.ellenberg at linbit.com>
Even though we check that we "should" be able to do lc_get_cumulative()
while holding the device->al_lock spinlock, it may still fail,
if some other code path decided to do lc_try_lock() with bad timing.
If that happened, we logged "LOGIC BUG for enr=...",
but still did not return an error.
The rest of the code now assumed that this request has references
for the relevant activity log extents.
The implcations are that during an active resync, mutual exclusivity of
resync versus application IO is not guaranteed. And a potential crash
at this point may not realizs that these extents could have been target
of in-flight IO and would need to be resynced just in case.
Also, once the request completes, it will give up activity log references it
does not even hold, which will trigger a BUG_ON(refcnt == 0) in lc_put().
Fix:
Do not crash the kernel for a condition that is harmless during normal
operation: also catch "e->refcnt == 0", not only "e == NULL"
when being noisy about "al_complete_io() called on inactive extent %u\n".
And do not try to be smart and "guess" whether something will work, then
be surprised when it does not.
Deal with the fact that it may or may not work. If it does not, remember a
possible "partially in activity log" state (only possible for requests that
cross extent boundaries), and return an error code from
drbd_al_begin_io_nonblock().
A latter call for the same request will then resume from where we left off.
Cc: stable at vger.kernel.org
Signed-off-by: Lars Ellenberg <lars.ellenberg at linbit.com>
Signed-off-by: Christoph Böhmwalder <christoph.boehmwalder at linbit.com>
---
drivers/block/drbd/drbd_actlog.c | 53 +++++++++++++-----------------
drivers/block/drbd/drbd_interval.h | 5 ++-
2 files changed, 27 insertions(+), 31 deletions(-)
diff --git a/drivers/block/drbd/drbd_actlog.c b/drivers/block/drbd/drbd_actlog.c
index 742b2908ff68..b3dbf6c76e98 100644
--- a/drivers/block/drbd/drbd_actlog.c
+++ b/drivers/block/drbd/drbd_actlog.c
@@ -483,38 +483,20 @@ void drbd_al_begin_io(struct drbd_device *device, struct drbd_interval *i)
int drbd_al_begin_io_nonblock(struct drbd_device *device, struct drbd_interval *i)
{
- struct lru_cache *al = device->act_log;
/* for bios crossing activity log extent boundaries,
* we may need to activate two extents in one go */
unsigned first = i->sector >> (AL_EXTENT_SHIFT-9);
unsigned last = i->size == 0 ? first : (i->sector + (i->size >> 9) - 1) >> (AL_EXTENT_SHIFT-9);
- unsigned nr_al_extents;
- unsigned available_update_slots;
unsigned enr;
- D_ASSERT(device, first <= last);
-
- nr_al_extents = 1 + last - first; /* worst case: all touched extends are cold. */
- available_update_slots = min(al->nr_elements - al->used,
- al->max_pending_changes - al->pending_changes);
-
- /* We want all necessary updates for a given request within the same transaction
- * We could first check how many updates are *actually* needed,
- * and use that instead of the worst-case nr_al_extents */
- if (available_update_slots < nr_al_extents) {
- /* Too many activity log extents are currently "hot".
- *
- * If we have accumulated pending changes already,
- * we made progress.
- *
- * If we cannot get even a single pending change through,
- * stop the fast path until we made some progress,
- * or requests to "cold" extents could be starved. */
- if (!al->pending_changes)
- __set_bit(__LC_STARVING, &device->act_log->flags);
- return -ENOBUFS;
+ if (i->partially_in_al_next_enr) {
+ D_ASSERT(device, first < i->partially_in_al_next_enr);
+ D_ASSERT(device, last >= i->partially_in_al_next_enr);
+ first = i->partially_in_al_next_enr;
}
+ D_ASSERT(device, first <= last);
+
/* Is resync active in this area? */
for (enr = first; enr <= last; enr++) {
struct lc_element *tmp;
@@ -529,14 +511,21 @@ int drbd_al_begin_io_nonblock(struct drbd_device *device, struct drbd_interval *
}
}
- /* Checkout the refcounts.
- * Given that we checked for available elements and update slots above,
- * this has to be successful. */
+ /* Try to checkout the refcounts. */
for (enr = first; enr <= last; enr++) {
struct lc_element *al_ext;
al_ext = lc_get_cumulative(device->act_log, enr);
- if (!al_ext)
- drbd_info(device, "LOGIC BUG for enr=%u\n", enr);
+
+ if (!al_ext) {
+ /* Did not work. We may have exhausted the possible
+ * changes per transaction. Or raced with someone
+ * "locking" it against changes.
+ * Remember where to continue from.
+ */
+ if (enr > first)
+ i->partially_in_al_next_enr = enr;
+ return -ENOBUFS;
+ }
}
return 0;
}
@@ -556,7 +545,11 @@ void drbd_al_complete_io(struct drbd_device *device, struct drbd_interval *i)
for (enr = first; enr <= last; enr++) {
extent = lc_find(device->act_log, enr);
- if (!extent) {
+ /* Yes, this masks a bug elsewhere. However, during normal
+ * operation this is harmless, so no need to crash the kernel
+ * by the BUG_ON(refcount == 0) in lc_put().
+ */
+ if (!extent || extent->refcnt == 0) {
drbd_err(device, "al_complete_io() called on inactive extent %u\n", enr);
continue;
}
diff --git a/drivers/block/drbd/drbd_interval.h b/drivers/block/drbd/drbd_interval.h
index 366489b72fe9..5d3213b81eed 100644
--- a/drivers/block/drbd/drbd_interval.h
+++ b/drivers/block/drbd/drbd_interval.h
@@ -8,12 +8,15 @@
struct drbd_interval {
struct rb_node rb;
sector_t sector; /* start sector of the interval */
- unsigned int size; /* size in bytes */
sector_t end; /* highest interval end in subtree */
+ unsigned int size; /* size in bytes */
unsigned int local:1 /* local or remote request? */;
unsigned int waiting:1; /* someone is waiting for completion */
unsigned int completed:1; /* this has been completed already;
* ignore for conflict detection */
+
+ /* to resume a partially successful drbd_al_begin_io_nonblock(); */
+ unsigned int partially_in_al_next_enr;
};
static inline void drbd_clear_interval(struct drbd_interval *i)
base-commit: 72f4d6fca699a1e35b39c5e5dacac2926d254135
--
2.52.0
More information about the drbd-dev
mailing list