/ 2005-12-06 14:39:20 -0500
\ Brad Barnett:
> I use a meta disk though, because I wanted to be able to move to
> ext3 without worries. This device was the primary, so.. what
> could it be doing during startup? The primary should already
> have all the data on the partition as it sits, shouldn't it?
>
> As for storage size, I'm not sure what you mean here by
> initializing. The drive was already initialized and prepped..
> I was just trying to get the drbd volume readable after a
> reboot.

drbd uses meta data, the amount of which is about proportional to
the actual storage size. independently of where this meta data is
stored, it has to be read (and updated) at "drbdadm up" time.
this is "drbd device initialization".
it takes some time, and more time for bigger storage.
it may appear to take "forever" on a drbd of several TB storage
when the lower level raid is rebuilding at the same time, and the
io-scheduler/driver/controller combination optimizes for high
throughput, and thus kills performance when you need low latency
instead.

> This is only a suggestion, as I know you have about 1000 other
> things to do. However, it would be nice if /proc/drbd showed
> status information during this setup stage.

we'll have a look at this.

> Anyhow, aside from all of this, I am worried about this:
>
> "these timeouts are all not active, yet."

as I tried to point out:
all configurable timeouts are for drbd _network_ connection/dialog
things. your setup seemed to be stuck in the drbd _disk_
attach/initialization phase, and there are no timeouts to tune there.

we need to find out where and why exactly it gets stuck, and fix that.
my suggestion is that it does not really get stuck but just
initializes very slowly, for whatever reason. if we can confirm this,
and find the reason, we probably can fix it there.
and this is most likely not a generic problem with drbd,
but a special one for your setup.

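to put a rough number on "about proportional to actual storage size":
the sketch below estimates only the dirty-block bitmap part of the
meta data. it assumes one bit per 4 KiB block of storage; the names
BLOCK_SIZE and bitmap_bytes are made up for this illustration and are
not part of drbd. treat it as a back-of-the-envelope aid, not as the
exact on-disk layout.

```python
# Assumption: the bitmap tracks one bit per 4 KiB block of backing storage.
BLOCK_SIZE = 4096

def bitmap_bytes(storage_bytes: int) -> int:
    """Rough size in bytes of a one-bit-per-block bitmap for the given storage."""
    blocks = (storage_bytes + BLOCK_SIZE - 1) // BLOCK_SIZE  # round up to whole blocks
    return (blocks + 7) // 8                                 # round up to whole bytes

TIB = 1024 ** 4
MIB = 1024 ** 2

if __name__ == "__main__":
    for size_tib in (1, 2, 4):
        mib = bitmap_bytes(size_tib * TIB) / MIB
        print(f"{size_tib} TiB storage -> about {mib:.0f} MiB of bitmap")
```

under that assumption the bitmap grows by 32 MiB per TiB of storage,
which is what has to be read through the lower level device while it
may be busy rebuilding.
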
> This is because there will definitely be times when the
> secondary will go away for a long period of time. I won't be
> able to have the primary down, just waiting for the secondary to
> come back. Does wfc-timeout work? It didn't seem to. :/

as pointed out before: you currently don't even get to the
"I am ready to talk to my peer, let's see if I get a timeout trying
to do so" stage.

anyways, if the primary runs, and you take the secondary down, the
primary happily hums along further. typically.

> Without the wfc-timeout option, drbd may become unusable for me.
> The whole reason we went with it, is so that the secondary can
> disappear for a while (upgrades, etc), then the primary
> (upgrades, etc).. and during this time we would have full access
> to our nfs partition.
>
> Now it seems like one power failure / issue could cause the box
> to reboot, and wait forever.. leaving me with no recourse but to
> degrade to a pure ext3 device mount...

that is how we do it all the time.
though, we tend to NOT use wfc-timeout at all, because you really risk
data integrity that way. you always should be able to answer "yes" at
that drbdadm prompt.

of course, at remote installations you should have a terminal server,
power switch, tell grub about the terminal switch, make sure that you
are able to ssh into the box asap during the boot phase etc.

-- 
: Lars Ellenberg                                  Tel +43-1-8178292-0  :
: LINBIT Information Technologies GmbH          Fax +43-1-8178292-82  :
: Schoenbrunner Str. 244, A-1120 Vienna/Europe  http://www.linbit.com :
__
please use the "List-Reply" function of your email client.