[DRBD-user] drbd turned on me, and bit my hand this morn
lists at l8r.net
Tue Dec 6 17:56:52 CET 2005
Just to add to this.
When drbdadm is in this "locked" state, I cannot kill it, even with kill -9.
This means that I must reboot the box, instead of simply rmmod'ing the
module and making my configuration changes.
Because of this, I do not think this is the expected behaviour.
I'm using 0.7.10-3 (The latest Debian stable version), and found the same
behaviour with 0.7.14 freshly installed from tarballs.
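For what it's worth, the reload cycle I would have expected to work is
roughly the following (a sketch only; the init script path is Debian's, and
the drbdadm invocations are from memory, so treat the details as
assumptions):

```
# Stop DRBD and unload the module so config changes can be applied
# without a full reboot (hypothetical session, run as root)
/etc/init.d/drbd stop     # runs drbdadm down on all resources
rmmod drbd                # remove the kernel module
vi /etc/drbd.conf         # make the configuration changes
modprobe drbd             # reload the module
drbdadm up all            # attach and connect all resources again
```

With drbdadm stuck unkillable, the stop step never completes, which is why
I end up rebooting instead.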
On Tue, 6 Dec 2005 10:46:30 -0500
Brad Barnett <lists at l8r.net> wrote:
> Hey all,
> drbd has been working fairly well for me until now, but something
> bizarre happened this morning.
> My secondary was taken offline, for regular maintenance. Out of the
> blue, the primary stopped serving files via NFS. I could ping the box,
> but before I had a chance to login, the box was rebooted locally (since
> it was not responding...)
> When our primary drbd box came back, drbd "locked" on boot. The
> Debian /etc/rc2.d/S70drbd startup script would run, spawning drbdadm.
> This would run, and just sit there for an unlimited period of time, as
> if waiting for the secondary (or something else).
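> This hang on boot looks like drbd waiting for its peer. If I understand
> the 0.7 docs correctly, that wait is governed by the startup section of
> drbd.conf; a sketch of the knobs I believe apply (the values here are
> illustrative assumptions, not my actual config):

```
# startup section of /etc/drbd.conf (DRBD 0.7)
startup {
  wfc-timeout       120;  # seconds to wait for the peer on a normal boot
  degr-wfc-timeout  60;   # shorter wait when the cluster was already degraded
}
```

> With no timeout set, I assume the init script will wait for the peer
> indefinitely, which would match what I saw.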
> I moved the initial rc2.d startup script, and tried all variety of drbd
> commands after fresh reboots, in order to get a response of some sort.
> I tried "drbdadm primary all" on this node, and so on, but drbdadm would
> never return or provide any sort of error message upon start.
> Typically, I would get something like:
> "drbd starting [d0 "
> At that point the process would lock up and just sit there, sometimes for
> 15 minutes or more. Commands issued after this happened would time out.
> Eventually, I tried to remove the secondary from the /etc/drbd.conf
> config file, but this resulted in drbd failing to run at all. Restoring
> the lines resulted in _two_ unconfigured devices appearing in my
> /proc/drbd file when trying to restart drbd.
> Prior to this, I only had one.
> Eventually I had to move to the raw ext3 partition, to restore access
> for my users.
> So, how could this have been avoided? Could anyone clue me in on what I
> did wrong? At one point I did edit the drbd config file and set every
> timeout value therein to 10 seconds. However, this was after I had
> edited the drbd.conf file as mentioned above, and I had two nodes
> configured at that point.
> Still, there must be a way to make a drbd partition primary, no matter
> what, regardless of any other circumstances. Otherwise, drbd seems very
> risky to me. :(
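> For the record, the only override I know of in 0.7 is the
> --do-what-I-say switch; I am quoting the syntax from memory, so please
> correct me if it is wrong:

```
# Force a resource primary even without a connected or consistent peer.
# DRBD 0.7 syntax; the "--" passes the option through drbdadm to drbdsetup.
drbdadm -- --do-what-I-say primary all
```

> The deliberately scary option name reflects that this can promote stale
> data, so it should only be used when you are sure this node's copy is
> the one you want.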
> I'm going to use this downtime to move from raid5 to raid10. However,
> I'm now concerned about the conversion process. I am worried that I will
> not be able to get my node running in standalone mode, so that I can do
> my initial raid format, drbd prep, etc. That is, I want to:
> - take my secondary, prep the raid
> - setup the raid as a drbd partition with a secondary configured, but
> never connected
> - prep the drbd partition, copy data from my ext3 (old drbd primary)
> partition via rsync, setting up my new drbd partition
> - make this live
> - setup my old primary as secondary, do a full sync, and be on my way
> with a new drbd setup
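> The plan above, sketched as a shell session. Every device name, the
> resource name r0, the mount points, and the raid layout below are
> assumptions for illustration, not my real setup:

```
# On the new secondary: build the raid10 array (hypothetical devices)
mdadm --create /dev/md0 --level=10 --raid-devices=4 /dev/sd[abcd]1

# Bring up the drbd resource on top of it; with the peer never reachable
# it will sit in WFConnection, so force it primary (0.7 syntax)
drbdadm up r0
drbdadm -- --do-what-I-say primary all

# Make the filesystem and copy the data over from the old ext3 partition
mkfs.ext3 /dev/drbd0
mount /dev/drbd0 /mnt/new
rsync -aHx /mnt/old/ /mnt/new/

# Later: reconfigure the old primary as secondary and let it full-sync
```

> Whether the forced-primary step really works with the peer absent is
> exactly what I am unsure about.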
> However, I am very worried as to what will happen without a secondary.
> Any help / guidance is greatly appreciated, here!
> drbd-user mailing list
> drbd-user at lists.linbit.com