Hey all,

drbd has been working fairly well for me until now, but something bizarre happened this morning. My secondary was taken offline for regular maintenance. Out of the blue, the primary stopped serving files via NFS. I could ping the box, but before I had a chance to log in, the box was rebooted locally (since it was not responding...).

When our primary drbd box came back, drbd "locked" on boot. The Debian /etc/rc2.d/S70drbd startup script would run, spawning drbdadm. drbdadm would then just sit there indefinitely, as if waiting for the secondary (or something else).

I moved the rc2.d startup script out of the way and tried all sorts of drbd commands after fresh reboots, in order to get a response of some sort. I tried "drbdadm primary all" on this node, and so on, but drbdadm would never return or print any sort of error message. Typically, I would get something like:

"drbd starting [d0 "

at which point the process would lock and just sit there (for minutes; 15 or more at one point). Commands given after this happened would time out.

Eventually, I tried to remove the secondary from the /etc/drbd.conf config file, but this resulted in drbd failing to run at all. Restoring the lines resulted in _two_ unconfigured lines appearing in /proc/drbd when I tried to restart drbd. Prior to this, I only had one. Eventually I had to move to the raw ext3 partition to restore access for my users.

So, how could this have been avoided? Could anyone clue me in on what I did wrong? At one point, I did edit the drbd config file and set every timeout value in it to 10 seconds. However, this was after I had edited the drbd.conf file as mentioned above, and I had two nodes at this point.

Still, there must be a way to make a drbd partition primary no matter what, regardless of any other circumstances. Otherwise, drbd seems very risky to me. :(

I'm going to use this downtime to move from raid5 to raid10. However, I'm now concerned about this conversion process. I am worried that I will not be able to take my node and get it to run in standalone mode, so I can do my initial raid format, drbd prep, etc. That is, I want to:

- take my secondary, prep the raid
- set up the raid as a drbd partition with a secondary configured, but never connected
- prep the drbd partition, copy data from my ext3 (old drbd primary) partition via rsync, setting up my new drbd partition
- make this live
- set up my old primary as secondary, do a full sync, and be on my way with a new drbd setup

However, I am very worried as to what will happen without a secondary now! Any help / guidance is greatly appreciated!

Thanks
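
P.S. In case it helps anyone reproduce what I was looking at in /proc/drbd, here is a rough sketch of a checker that lists each device with its connection state and flags the ones still waiting for the peer. It assumes each device line starts with the minor number followed by a "cs:" field, with the roles in "st:" (or "ro:" on newer releases). The function names and the sample text are made up for illustration; the sample was written by hand and is not output captured from my box.

```python
import re

# One device line of /proc/drbd is assumed to look roughly like:
#   " 0: cs:WFConnection st:Primary/Unknown ..."
# "cs:" is the connection state; "st:" (or "ro:") holds the local/peer roles.
DEVICE_LINE = re.compile(
    r"^\s*(?P<minor>\d+):\s+cs:(?P<cs>\w+)(?:\s+(?:st|ro):(?P<roles>\S+))?"
)


def parse_proc_drbd(text):
    """Return a list of (minor, connection_state, roles_or_None) tuples."""
    devices = []
    for line in text.splitlines():
        m = DEVICE_LINE.match(line)
        if m:
            devices.append((int(m.group("minor")), m.group("cs"), m.group("roles")))
    return devices


def waiting_for_peer(devices):
    """Minors whose connection state is WFConnection (waiting for the peer)."""
    return [minor for minor, cs, _ in devices if cs == "WFConnection"]


# Hypothetical sample, written by hand for illustration only.
SAMPLE = """\
version: (illustrative sample)
 0: cs:WFConnection st:Primary/Unknown ld:Consistent
 1: cs:Unconfigured
"""

if __name__ == "__main__":
    try:
        with open("/proc/drbd") as f:
            text = f.read()
    except OSError:
        text = SAMPLE  # fall back to the hand-written sample if drbd is not loaded
    devices = parse_proc_drbd(text)
    for minor, cs, roles in devices:
        print(minor, cs, roles or "-")
    print("waiting for peer:", waiting_for_peer(devices))
```

Running something like this before and after each step of the raid10 conversion would at least show whether a device is sitting in WFConnection or has ended up Unconfigured, rather than guessing from a hung drbdadm.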