On Tuesday 07 June 2005 16:11, Lars Ellenberg wrote:
> Note that all processes waiting for disk io are counted as runable!
> Therefore, if a lot of processes wait for disk io, the "load average" goes
> straight up, though the system actually may be almost idle cpu-wise ...
>
> E.g. crash your nfs server

It may well have been NFS-related, though I didn't know that and couldn't
figure it out at the time. I stopped everything that was running and the
load still sat at 4, which was just worrisome.

> bad ram? motherboard? southbridge or whatever?

Motherboard most likely. Not RAM.

> drbdadm reconnect all
> should have worked (on the one that was StandAlone ).

Thanks, I'll remember that one.

> but you can then
> drbdadm invalidate
> on the one with the bad data (probably the current Secondary),

Okay. Where is proper documentation on this? I haven't been able to find
much besides quick guides for setting things up, nothing really
comprehensive.

> well. you should have done so before, and more importantly set the
> known BAD server to "inconsistent", so it will receive a full sync...

How?

> maybe some oddities in imap/maildir/symlink/header cache or some such?

No. The individual mail files were not present on the system when we first
brought up services. We looked in several maildir directories individually,
and the files were simply not there. When the files started re-appearing,
they didn't come all at once, but just started showing up here and there.

One of our clients had called complaining about missing mail, and we told
him that we would check our backups in case they were more recent than the
old DRBD data (they weren't). He later called back to say thanks, and that
he saw his missed mail showing up in his mailbox one by one. Looking at the
filesystem, the files that had once been there, and were later simply not
there, were simply there again.

> or maybe it was just a meteorite shower, cosmic rays, you know :->

I'm pretty certain that somehow or another DRBD ended up working out the
problem, because the files had *never been stored* on this drive since it
was out of sync, and the initial recovery sync did not copy these files
over.

> btw, you are sure the hardware is ok (again) ?

We're sure that the *current* hardware is ok. As described, the other
server is bad. Currently there is only one machine (we're installing a
replacement for the secondary tonight).

Cheers,
--
Casey Allen Shobe | http://casey.shobe.info
cshobe at seattleserver.com | cell 425-443-4653
AIM & Yahoo: SomeLinuxGuy | ICQ: 1494523
SeattleServer.com, Inc. | http://www.seattleserver.com
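
The load average point quoted at the top of this message can be checked on a
Linux machine: the kernel's load average counts tasks in uninterruptible
sleep (state D, typically blocked on disk or NFS I/O) along with runnable
ones (state R). The following is only a rough sketch and not part of the
original thread. The helper names (parse_loadavg, parse_state, count_states,
report) are made up for illustration, and it assumes a Linux /proc
filesystem is mounted.

```python
from collections import Counter
from pathlib import Path


def parse_loadavg(text):
    """Return the 1-, 5- and 15-minute load averages from /proc/loadavg text."""
    one, five, fifteen = text.split()[:3]
    return float(one), float(five), float(fifteen)


def parse_state(stat_text):
    """Return the one-letter process state from /proc/<pid>/stat text."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so take whatever follows the last ')'.
    return stat_text.rsplit(")", 1)[1].split()[0]


def count_states(proc_root="/proc"):
    """Count processes per state letter (R = runnable, D = uninterruptible)."""
    counts = Counter()
    for stat in Path(proc_root).glob("[0-9]*/stat"):
        try:
            counts[parse_state(stat.read_text(errors="replace"))] += 1
        except (OSError, IndexError):
            continue  # the process exited while we were reading it
    return counts


def report():
    """Print the load averages next to the current R and D process counts."""
    load = parse_loadavg(Path("/proc/loadavg").read_text())
    states = count_states()
    print("load averages:", load)
    print("runnable (R):", states["R"], "uninterruptible (D):", states["D"])
```

Calling report() while the load is high but the CPU is mostly idle should
show whether processes stuck in state D (for example on a hung NFS mount)
account for the number.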