Could someone knowledgeable please tell me why this may happen?

Under load (a massive local quota update by a script, plus cpio of a big tree from an NFS client), synchronization is repeatedly restarted after ~1% completion:

Mar 23 11:21:11 nfsa1.mail.back kernel: drbd0: Connection established. size=214291003 KB / blksize=4096 B
Mar 23 11:21:11 nfsa1.mail.back kernel: drbd0: Synchronisation started blks=15
Mar 23 11:22:58 nfsa2.mail.back kernel: drbd0: Connection lost.
Mar 23 11:22:59 nfsa1.mail.back kernel: drbd0: Syncer aborted.
Mar 23 11:22:59 nfsa1.mail.back kernel: drbd0: Connection lost.
Mar 23 11:22:59 nfsa2.mail.back kernel: drbd0: Connection established. size=214291003 KB / blksize=4096 B
Mar 23 11:22:59 nfsa1.mail.back kernel: drbd0: Connection established. size=214291003 KB / blksize=4096 B
Mar 23 11:22:59 nfsa1.mail.back kernel: drbd0: Synchronisation started blks=15
Mar 23 11:23:50 nfsa2.mail.back kernel: drbd0: Connection lost.
Mar 23 11:23:57 nfsa1.mail.back kernel: drbd0: Syncer aborted.
Mar 23 11:23:57 nfsa1.mail.back kernel: drbd0: Connection lost.
Mar 23 11:23:57 nfsa2.mail.back kernel: drbd0: Connection established. size=214291003 KB / blksize=4096 B
Mar 23 11:23:57 nfsa1.mail.back kernel: drbd0: Connection established. size=214291003 KB / blksize=4096 B
Mar 23 11:23:57 nfsa1.mail.back kernel: drbd0: Synchronisation started blks=15
Mar 23 11:24:54 nfsa2.mail.back kernel: drbd0: Connection lost.
Mar 23 11:25:02 nfsa1.mail.back kernel: drbd0: Syncer aborted.
Mar 23 11:25:02 nfsa1.mail.back kernel: drbd0: Connection lost.

Am I right in assuming that when the load average is high on the master and the "application" write speed is higher than sync-min, the syncer, running at low priority, cannot keep pace and disconnects?

Also, I was getting a significant number of messages like these:

Mar 22 16:29:27 nfsa1.mail.back kernel: drbd0: [kjournald/819] sock_sendmsg timeout count down: ko=4294967295
Mar 22 16:34:37 nfsa1.mail.back kernel: drbd0: pending_cnt <0 !!!
Mar 22 16:44:02 nfsa1.mail.back kernel: drbd0: [kjournald/819] sock_sendmsg timeout count down: ko=4294967295
Mar 22 16:58:34 nfsa1.mail.back kernel: drbd0: [kjournald/819] sock_sendmsg timeout count down: ko=4294967295
Mar 22 17:08:26 nfsa1.mail.back kernel: drbd0: [kjournald/819] sock_sendmsg timeout count down: ko=4294967295
...
Mar 22 19:35:09 nfsa1.mail.back kernel: drbd0: [drbd_syncer_0/6603] sock_sendmsg timeout count down: ko=4294967295

What do they mean?

The kernel is 2.4.25 (SMP, on Xeon), drbd 0.6.10-cvs (now trying 0.6.12).

Thanks,
Eugene