Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.
Just to add to this. When drbdadm is in this "locked" state, I cannot
kill it, or even kill -9 it. This means that I must reboot the box
instead of rmmod'ing the module and making my configuration changes.
Because of this, I do not think this is the expected behaviour.

I'm using 0.7.10-3 (the latest Debian stable version), and found the
same behaviour with 0.7.14 freshly installed from tarballs.

On Tue, 6 Dec 2005 10:46:30 -0500
Brad Barnett <lists at l8r.net> wrote:

> Hey all,
>
> drbd has been working fairly well for me until now, but something
> bizarre happened this morning.
>
> My secondary was taken offline for regular maintenance. Out of the
> blue, the primary stopped serving files via NFS. I could ping the
> box, but before I had a chance to log in, the box was rebooted
> locally (since it was not responding...).
>
> When our primary drbd box came back, drbd "locked" on boot. The
> Debian /etc/rc2.d/S70drbd startup script would run, spawning drbdadm.
> This would run and just sit there indefinitely, as if waiting for the
> secondary (or something else).
>
> I moved the initial rc2.d startup script aside and tried a variety of
> drbd commands after fresh reboots, in order to get a response of some
> sort. I tried "drbdadm primary all" on this node, and so on, but
> drbdadm would never return or provide any sort of error message on
> startup. Typically, I would get something like:
>
> "drbd starting [d0 "
>
> At which point the process would lock... and eventually (15 minutes
> or more, at one point) just sit there. Commands given after this
> happened would time out.
>
> Eventually, I tried to remove the secondary from the /etc/drbd.conf
> config file, but this resulted in drbd failing to run at all.
> Restoring the lines resulted in _two_ unconfigured lines appearing in
> my /proc/drbd file when trying to restart drbd.
>
> Prior to this, I only had one.
>
> Eventually I had to move to the raw ext3 partition to restore access
> for my users.
>
> So, how could this have been avoided? Could anyone clue me in on what
> I did wrong? At one point, I did edit the drbd config file and set
> every timeout value in it to 10 seconds. However, this was after I
> had edited the drbd.conf file as mentioned above, and I had two nodes
> configured at that point.
>
> Still, there must be a way to make a drbd partition primary, no
> matter what, regardless of any other circumstances. Otherwise, drbd
> seems very risky to me. :(
>
> I'm going to use this downtime to move from raid5 to raid10. However,
> I'm concerned about this conversion process now. I am worried that I
> will not be able to take my node and get it to run in standalone
> mode, so I can do my initial raid format, drbd prep, etc. That is, I
> want to:
>
> - take my secondary, prep the raid
> - set up the raid as a drbd partition with a secondary configured,
>   but never connected
> - prep the drbd partition, copy data from my ext3 (old drbd primary)
>   partition via rsync, setting up my new drbd partition
> - make this live
> - set up my old primary as secondary, do a full sync, and be on my
>   way with a new drbd setup
>
> However, I am very worried as to what will happen without a secondary
> now!
>
> Any help / guidance is greatly appreciated, here!
>
> Thanks
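For anyone who hits the same thing: a quick way to see why kill -9 has
no effect is to check the process state. A process blocked in an
uninterruptible kernel call shows state "D" in ps, and no signal will
touch it until that call returns; the module also cannot be rmmod'ed
while a device is still configured. Nothing drbd-specific is assumed
here beyond standard procps tools:

    # Is drbdadm stuck in uninterruptible sleep (state "D")?
    # wchan shows the kernel function it is blocked in.
    ps axo pid,stat,wchan:30,cmd | grep '[d]rbd'

    # Confirm the device is still configured and the module pinned:
    cat /proc/drbd
    lsmod | grep drbd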
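On the boot-time hang itself: as far as I can tell, that is drbd's
"wait for connection" phase. The init script blocks until the peer
shows up, and with the default settings it will wait forever, which
looks exactly like a lockup. In 0.7 this should be tunable from the
startup section of drbd.conf; a minimal sketch (option names as I
remember them from 0.7 -- verify against drbd.conf(5) for your
version):

    resource r0 {
      startup {
        # Wait at most 120s for the peer on a normal boot.
        # 0 means "wait forever", which presents as a hang.
        wfc-timeout      120;

        # Shorter wait when the cluster was already degraded
        # before the reboot.
        degr-wfc-timeout  60;
      }
      # disk, net and syncer sections as before
    }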
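As for "there must be a way to make a drbd partition primary no matter
what": with 0.7 I believe the escape hatch is the --do-what-I-say
switch. Note the bare "--", which makes drbdadm pass the option
through to drbdsetup:

    # Force primary even though the peer is absent or inconsistent.
    # Dangerous by design: it tells drbd to trust this node's data
    # unconditionally.
    drbdadm -- --do-what-I-say primary all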
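And for the raid5-to-raid10 migration plan quoted above, the
standalone steps might look roughly like this. This is only a sketch:
the resource name (r0), device (/dev/drbd0) and mount points are
placeholders of mine, not taken from the original post:

    # 1. On the rebuilt node: bring the resource up without a peer.
    #    It attaches the disk and then fails to connect, which is fine.
    drbdadm up r0

    # 2. Force it primary despite the missing peer (see above).
    drbdadm -- --do-what-I-say primary r0

    # 3. Create the filesystem and copy the data across.
    mkfs.ext3 /dev/drbd0
    mount /dev/drbd0 /mnt/new
    rsync -aHx --numeric-ids /mnt/old/ /mnt/new/

    # 4. Once this is live, configure the old primary as the new
    #    secondary and let the full sync bring it up to date.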