george young wrote:
>
> What happened: One of a pair of servers (pig-app) hung (after several
> months uptime), and had to be quickly rebooted.
>
> What went wrong: On rebooting, pig-app insisted on waiting the 1.7 hours
> for the "db" partition to sync, even though pig-app does not mount "db".
>
> The configuration: two servers, pig-app and pig-db. Normally pig-app
> mounts /home through drbd and pig-db mounts /db through drbd. The two
> file systems are mirrored on the other server, so if one dies, the other
> can take over services. The problem is that when pig-app rebooted, it
> should (I think) have come up fully and synced its copy of /db in the
> background, not held up the boot process (and kept my users waiting!).
>
> Is my configuration wrong?
<SNIP>

Why were your users waiting? pig-db could have (should have?) taken over
pig-app's work (via the heartbeat configuration) until pig-app was fully
ready to come back online. At worst your users should have seen the
system working slowly (so they wait a few seconds), not a full work
stoppage.

Granted, I have mine set up only as CVS and NFS servers, but when there is
a fault on one (one that actually causes drbd to panic the kernel, as I
instructed it to do), people generally don't even notice: there is a
30-50 second burble of no activity and then the systems continue to
function[1].

[1] with two exceptions:
1: any cvs commands in operation at the time have to be restarted, no
   big deal.
2: we have a Red Hat 6.2 machine which does NOT like to be in runlevel 3
   (or higher) while a fall-over is happening; on that box I have to
   issue `telinit 2`, wait for the fall-over to complete, and then issue
   `telinit 3`.

-- 
Todd Denniston
Crane Division, Naval Surface Warfare Center (NSWC Crane)
Harnessing the Power of Technology for the Warfighter