At random times (possibly under high load, although I can't say for certain) the drbd partition falls into read-only mode. The software that accesses the data on the partition starts reporting errors like:

May 1 21:24:47 [kernel] Aborting journal on device drbd0.
May 1 21:24:47 [lmtpunix] IOERROR: appending index records for user.joe: Input/output error
May 1 21:24:47 [kernel] EXT3-fs error (device drbd0) in add_dirent_to_buf: Journal has aborted
May 2 01:24:47 [postfix/postdrop] warning: mail_queue_enter: create file maildrop/225423.27996: Read-only file system
May 1 21:24:47 [kernel] EXT3-fs error (device drbd0) in start_transaction: Journal has aborted
- Last output repeated 4 times -
May 1 21:24:47 [lmtpunix] IOERROR: creating quota file /export/cyrus/imap/quota/j/user.joe.NEW: Read-only file system
May 1 21:24:47 [kernel] EXT3-fs error (device drbd0) in start_transaction: Journal has aborted
May 1 21:24:47 [lmtpunix] DBERROR: error storing user.joe: cyrusdb error
May 1 21:24:47 [lmtpunix] LOSTQUOTA: unable to record use of 3029 bytes in quota file user.joe
May 1 21:24:47 [lmtpunix] IOERROR: error unlinking file /export/cyrus/spool/imap/stage./27887-1114997080-0: Read-only file system
May 1 21:24:47 [kernel] EXT3-fs error (device drbd0) in start_transaction: Journal has aborted
May 1 21:24:47 [postfix/local] fatal: update queue file active/0/0CED617BC073: Read-only file system
May 1 21:24:47 [kernel] EXT3-fs error (device drbd0) in start_transaction: Journal has aborted

The only solution is to reboot the system and let the other cluster twin take over.

Does anyone have any idea what is going on? This is causing enormous file corruption issues for us.

Thanks,
Lee