[DRBD-user] Question reg. protocol C

Lars Ellenberg lars.ellenberg at linbit.com
Tue Sep 12 00:03:27 CEST 2017

On Mon, Sep 11, 2017 at 06:00:08PM +1000, Igor Cicimov wrote:
> On 11 Sep 2017 4:20 pm, "Ravi Kiran Chilakapati" <
> ravikiran.chilakapati at gmail.com> wrote:
> Thank you for the response Roland.
> I will start going through the source code. In the meantime, it will be
> great if these preliminary questions can be answered.
> Q: Is Protocol C a variant of any standard atomic commit protocol (like
> 2PC/3PC etc.)? Or is it a proprietary algorithm?

Not at all.

It is very simple:

Write request reaches DRBD.

The request is passed down to local storage, if any,
and forwarded to all currently connected peers.

The local IO completion comes in eventually, and
peer write-acks (or, for failed remote writes, neg-acks)
come in eventually.

If peer acks do NOT come in,
then connection loss will be declared at some point,
and "neg-acks" will be "faked" for any acks not yet received.

All acks there: great, we can complete to upper layers.
Usually we complete to upper layers as OK if at least one write was successful
(similar to what a raid1 would do).
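
The completion rule above can be sketched roughly as follows. This is a toy
Python model, not DRBD code: the names Ack, fake_missing_acks and
complete_write are made up for illustration, and it only covers the "usual"
case where one successful write is enough.

```python
from enum import Enum


class Ack(Enum):
    OK = "ok"          # write succeeded on this replica
    FAILED = "failed"  # local IO error, or neg-ack from a peer


def fake_missing_acks(peer_acks, connected_peers):
    """After connection loss, count every peer that never answered as a neg-ack."""
    return {peer: peer_acks.get(peer, Ack.FAILED) for peer in connected_peers}


def complete_write(local_result, peer_acks):
    """Return True if the write is completed as OK to upper layers.

    local_result is None when there is no local storage.
    Assumption: "usual" behaviour, at least one successful write suffices.
    """
    results = list(peer_acks.values())
    if local_result is not None:
        results.append(local_result)
    return any(result is Ack.OK for result in results)
```

For example, a local write that succeeded while the only peer never answered
would still complete as OK once the missing ack has been faked:
complete_write(Ack.OK, fake_missing_acks({}, ["peer1"])) returns True.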

If we see a disk error, we usually detach (this can be configured).

If we see a connection loss, we usually keep going,
but can freeze and call a "fence-peer" handler.

If all disks fail, we have usually detached from all of them by now,
and either propagate IO errors to upper layers
(again, think raid1 where all disks have failed),
or freeze again (and hope that access to the disk we lost last
is somehow restored, with all previously "successfully" completed
writes still intact).

Very recently we also introduced a "quorum" concept: on loss of
"quorum", DRBD is supposed to freeze or, if you so choose, throw IO errors.

> Q: Let's assume there are 2 disks (D1, D2). Let's assume that D2 is
> experiencing a fail-recover situation,

"fail-recover situation" ...  What would that be?
DRBD cannot "recover" a "failed" disk.
How would it do that?

"typically" (the supported way):
D2 fails, we detach from it.
Now it is no longer there.

> but D1 has failed after a D2
> failure,

Too bad, now we have no data anymore.

> but before D2 has recovered. What is the behavior of DRBD in such
> a case? Are all future disk writes blocked until both D1 and D2 are
> available, and are confirmed to be in sync?

"it depends".

Typically, you now see IO errors in the upper layers.
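
What "it depends" means here can be sketched as a small decision function.
Again this is a toy Python model and not DRBD code: the Policy fields and
io_behaviour are hypothetical names, they do not correspond one-to-one to
DRBD configuration options, and the order of the checks is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Policy:
    # All fields are illustrative, not real DRBD option names.
    freeze_on_connection_loss: bool = False  # freeze and call a fence-peer handler
    freeze_on_no_data: bool = False          # freeze instead of returning IO errors
    use_quorum: bool = False                 # enable the quorum check
    freeze_on_no_quorum: bool = True         # freeze (True) or IO errors (False)


def io_behaviour(policy, disks_attached, peers_lost, have_quorum=True):
    """Return what upper layers see for new IO, given the current state.

    disks_attached counts every disk (local or via a peer) still usable.
    """
    if policy.use_quorum and not have_quorum:
        return "freeze" if policy.freeze_on_no_quorum else "io-error"
    if disks_attached == 0:
        # Every disk has failed and has been detached.
        return "freeze" if policy.freeze_on_no_data else "io-error"
    if peers_lost and policy.freeze_on_connection_loss:
        return "freeze-and-fence"
    return "continue"


# D2 fails (detach), then D1 fails: no disk is left, default policy.
print(io_behaviour(Policy(), disks_attached=0, peers_lost=True))  # io-error
```

With freeze_on_no_data=True the same call returns "freeze" instead, which is
the "hope the last disk comes back" case.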

: Lars Ellenberg
: LINBIT | Keeping the Digital World Running
: DRBD -- Heartbeat -- Corosync -- Pacemaker

DRBD® and LINBIT® are registered trademarks of LINBIT
please don't Cc me, but send to list -- I'm subscribed
