[DRBD-user] Unable to reconnect the secondary after shrinking the primary

Lars Ellenberg Lars.Ellenberg at linbit.com
Sun Oct 15 17:53:08 CEST 2006


/ 2006-10-13 17:48:55 +0300
\ Cyril Bouthors:
> On 13 Oct 2006, Lars.Ellenberg at linbit.com wrote:
> 
> >> DRBD should auto-detect this and work if not explicitely specified.
> >
> > should autodetect _what_ ?
> 
> The disk size.

in fact, drbd _does_ detect the disk size automatically.
so I suspect that in your case the _actual_ disk size
was still larger than what you specified (the "user provided size"),
and that you probably even restarted drbd on the node where you did the
resize first, before you reconnected the nodes. or something like that.

this is obviously something we did not expect (we did not even expect
that someone would want to shrink his device, I think some file systems
can not even handle shrinking at all...)

the intended usage of drbdadm resize is to grow,
and it is expected to happen in "Connected Primary/Secondary" state.

shrinking in that state should be possible, too.

that it is possible to shrink in StandAlone mode is an undocumented
feature^Wbug...  it sometimes comes in handy, but as you happen to know,
it may lead to interesting problems.

anyways, sorry that things are not implemented the way you want to use
them. I'm busy with other things currently, so if you want to think
about an algorithm that handles each and every corner case...


here is what we have:
drbd knows the actual "physical" size of its lower level device.
from the meta data it knows the "last agreed" size, this is the size
it has agreed upon with the peer the last time it was connected.
this could be "unknown" (zero) when we have never been connected.
it optionally knows the "user provided" size, which is the size you
provided with "drbdsetup disk". usually this user size is not given at
all, and initialized to zero (not-effective).

the relationship would be
  (lower) (last agreed) (user)
  lo_size >= la_size >= u_size

now, upon connection, you need to "agree" on some size, since drbd has
to be the same size on both nodes.

while they are connected, you can request that they readjust their size,
in case any lo_size has changed, or you can request that they readjust
their size to something you specify (u_size), in which case this u_size
may be larger than the la_size. it cannot be larger than the smaller of
the two lo_size values, obviously.
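
to make the resize rule concrete, here is a minimal sketch in Python (not
drbd code). the function name resize_target is hypothetical, sizes are plain
integers, and zero means "u_size not given". that a resize without u_size
simply goes to the smaller lo_size is my reading of the paragraph above, so
treat it as an assumption.

```python
def resize_target(lo_local, lo_peer, u_size=0):
    """Size both nodes would readjust to while connected (illustrative only)."""
    # the device can never be larger than the smaller lower level device
    limit = min(lo_local, lo_peer)
    if u_size == 0:
        # assumption: no user size given, so use all the space both nodes have
        return limit
    if u_size > limit:
        raise ValueError("u_size exceeds the smaller lo_size")
    # u_size may be larger or smaller than the last agreed size
    return u_size
```

for example, resize_target(1000, 900) gives 900, and asking for 950 in that
situation is rejected.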

currently, in drbd 0.7, we communicate the lo_size (since that is the
upper limit we could possibly agree upon) and the u_size, which
normally is zero (unset). we _assume_ that the la_size is the same on
both nodes or unset...

I think this assumption was wrong in your case, since you tampered with
that value without communicating it, and that was the cause of your
trouble, and it was resolved by you communicating this (by putting the
disk-size in the config file).

so our current algorithm is, upon "drbdsetup ... disk ...",
to read la_size from meta data, and take u_size from command line, if given,
and assign a provisional device size of min_not_zero(la_size,u_size),
unless lo_size (queried from the lower level device) would be even
smaller.
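
as a rough illustration of that step, a small Python sketch (not the kernel
code). provisional_size is a hypothetical name, sizes are plain integers and
zero means "unset". what happens when both la_size and u_size are unset is
not spelled out above; falling back to lo_size in that case is an assumption
of this sketch.

```python
def min_not_zero(a, b):
    """Smaller of two sizes, where 0 means 'unset' and is ignored."""
    if a == 0:
        return b
    if b == 0:
        return a
    return min(a, b)


def provisional_size(lo_size, la_size, u_size=0):
    """Device size assumed at attach time, before talking to the peer."""
    wanted = min_not_zero(la_size, u_size)
    # never exceed the lower level device; if nothing is set at all,
    # this falls back to lo_size (assumption, see lead-in)
    return min_not_zero(wanted, lo_size)
```

so with lo_size 1000, la_size 800 and no u_size the device comes up at 800,
and a u_size of 600 would bring it up at 600 instead.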

upon connection, we communicate u_size (may be zero/unset) and lo_size,
and "agree" on the smallest value that is set.
but, in case we are primary, we refuse to shrink there, otherwise a
misconfigured secondary would "truncate" a live filesystem...
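
the agreement step could be sketched like this in Python (again not drbd
code). agree_size and its parameters are hypothetical names, zero means
"unset", and returning None to signal "primary refuses to shrink" is just
how this sketch reports the refusal.

```python
def agree_size(my_lo, my_u, peer_lo, peer_u, current_size, is_primary):
    """Size to agree on at connect time, or None if a primary would shrink."""
    # only values that are actually set (non-zero) take part
    candidates = [s for s in (my_lo, my_u, peer_lo, peer_u) if s != 0]
    agreed = min(candidates) if candidates else 0
    if is_primary and agreed < current_size:
        # a primary must not truncate a live filesystem
        return None
    return agreed
```

a primary at size 1000 whose peer only offers lo_size 800 gets None here,
while a secondary in the same situation would go down to 800.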

the relevant code is in drbd_ioctl_set_disk,
in drbd_ioctl case DRBD_IOCTL_SET_DISK_SIZE,
in drbd_send_param, and in receive_param.

may well be that there is some logic bug in there somewhere, based on
wrong assumptions or the like, or that it is only "not complete" and
we'd have to communicate more information...

for the cases I could come up with, it seemed to work...

-- 
: Lars Ellenberg                                  Tel +43-1-8178292-0  :
: LINBIT Information Technologies GmbH            Fax +43-1-8178292-82 :
: Schoenbrunner Str. 244, A-1120 Vienna/Europe   http://www.linbit.com :
__
please use the "List-Reply" function of your email client.


