[DRBD-user] drbd turned on me, and bit my hand this morn

Brad Barnett lists at l8r.net
Tue Dec 6 20:39:20 CET 2005

On Tue, 6 Dec 2005 19:46:35 +0100
Lars Ellenberg <Lars.Ellenberg at linbit.com> wrote:

> / 2005-12-06 13:17:25 -0500
> \ Brad Barnett:
> > > how much storage?
> > 
> > 1.2T
> so it needs to do meta data io for about 40 MB.
> it does so using synchronous io for each 512 byte sector.
> so it has to do about 70 thousand synchronous io operations...
> that may take some time, especially if something else runs on the box,
> or your io-scheduler is "too intelligent".
> I don't know how long, though.
> typically it should not take much more than a few minutes.
> we at LINBIT use DRBD-plus for such storage sizes, where "drbdadm up"
> succeeds within a few seconds even on several TB storage.
> to verify whether drbd is indeed "hung" or just slooowly initializing for
> that storage size, you could
>  watch -n1 cat /proc/partitions
> and watch the read/write figures for the lower level device.
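
For reference, the figures quoted above can be roughly reproduced with a
back-of-envelope calculation. This is only a sketch under assumptions that
are not stated in the thread: that the meta data in question is a bitmap
with one bit per 4 KiB of storage, and that "1.2T" means 1.2e12 bytes. The
function name is made up for illustration.

```python
# Rough estimate of DRBD meta data size and the number of sector-sized writes.
# Assumptions (not from the thread): one bitmap bit per 4 KiB of storage,
# and "1.2T" read as 1.2e12 bytes.
STORAGE_BYTES = 1.2e12
BYTES_PER_BIT = 4096   # assumed storage covered by one bitmap bit
SECTOR_BYTES = 512     # each synchronous io covers one 512 byte sector


def metadata_io_estimate(storage_bytes,
                         bytes_per_bit=BYTES_PER_BIT,
                         sector_bytes=SECTOR_BYTES):
    """Return (bitmap size in bytes, number of sector-sized io operations)."""
    bitmap_bits = storage_bytes / bytes_per_bit
    bitmap_bytes = bitmap_bits / 8
    sector_ops = bitmap_bytes / sector_bytes
    return bitmap_bytes, sector_ops


bitmap_bytes, sector_ops = metadata_io_estimate(STORAGE_BYTES)
print(f"bitmap size: {bitmap_bytes / 1e6:.1f} MB")
print(f"512-byte synchronous io operations: {sector_ops:,.0f}")
```

Under those assumptions this comes out at roughly 37 MB and a little over
70 thousand operations, which is in line with the "about 40 MB" and "about
70 thousand" quoted above.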

I use a meta disk though, because I wanted to be able to move to ext3
without worries.  This device was the primary, so.. what could it be
doing during startup?  The primary should already have all the data on the
partition as it sits, shouldn't it?

As for storage size, I'm not sure what you mean here by initializing.  The
drive was already initialized and prepped.. I was just trying to get the
drbd volume readable after a reboot.

> > At one point, I changed the config file to 10 seconds, for every time
> > listed there (including the wfc-timeout mentioned by Diego).  It
> > didn't seem to time out, and I waited for quite some time (at least 5
> > minutes).
> these timeouts are all not active, yet.
> your setup "appears to be hung" during device initialization.
> and it may well be that your lower level storage device is just slooow
> for sector-sized synchronous access.

I've used this setup for a couple of months now, and done quite a few
reboots (especially during the testing phase).  It was never this slow at
bootup.  It's always been instant.

This device _was_ the primary though, I'm not sure why it would take such
a long time for it to sync.  There's no secondary any more, so it was just
an attempt to get the device up and running.

This is only a suggestion, as I know you have about 1000 other things to
do.  However, it would be nice if /proc/drbd showed status information
during this setup stage.  

Anyhow, aside from all of this, I am worried about this:

"these timeouts are all not active, yet."

This is because there will definitely be times when the secondary will go
away for a long period of time.  I won't be able to have the primary down,
just waiting for the secondary to come back.  Does wfc-timeout work?  It
didn't seem to. :/

Without the wfc-timeout option, drbd may become unusable for me.  The
whole reason we went with it is so that the secondary can disappear for a
while (upgrades, etc), then the primary (upgrades, etc).. and during this
time we would still have full access to our nfs partition.

Now it seems like one power failure / issue could cause the box to reboot,
and wait forever.. leaving me with no recourse but to degrade to a pure
ext3 device mount...

Please tell me I'm wrong. :(

I can do all the testing you want on Thursday night..

> > I would modprobe the drbd module manually.
> > I still don't understand why it always shows two drbd devices though.
> > :/
> because, if you modprobe it manually, it defaults to "minor_count=2".
> if you let the script do it, it will count the number of actually used
> resources in the drbd.conf, and use that value; unless overridden
> by an entry in the global {} section.

Ah!  This is a mind easer. ;)  I was worried a bit.

