[DRBD-user] Syncer aborted. Connection lost.

Eugene Crosser crosser at rol.ru
Tue Mar 23 09:49:14 CET 2004


Could someone knowledgeable please tell me why this may happen?
Under load (local massive quota update by a script plus cpio of a big
tree from an NFS client) synchronization is repeatedly restarted after
~1% completion:

Mar 23 11:21:11 nfsa1.mail.back kernel: drbd0: Connection established. size=214291003 KB / blksize=4096 B
Mar 23 11:21:11 nfsa1.mail.back kernel: drbd0: Synchronisation started blks=15
Mar 23 11:22:58 nfsa2.mail.back kernel: drbd0: Connection lost.
Mar 23 11:22:59 nfsa1.mail.back kernel: drbd0: Syncer aborted.
Mar 23 11:22:59 nfsa1.mail.back kernel: drbd0: Connection lost.
Mar 23 11:22:59 nfsa2.mail.back kernel: drbd0: Connection established. size=214291003 KB / blksize=4096 B
Mar 23 11:22:59 nfsa1.mail.back kernel: drbd0: Connection established. size=214291003 KB / blksize=4096 B
Mar 23 11:22:59 nfsa1.mail.back kernel: drbd0: Synchronisation started blks=15
Mar 23 11:23:50 nfsa2.mail.back kernel: drbd0: Connection lost.
Mar 23 11:23:57 nfsa1.mail.back kernel: drbd0: Syncer aborted.
Mar 23 11:23:57 nfsa1.mail.back kernel: drbd0: Connection lost.
Mar 23 11:23:57 nfsa2.mail.back kernel: drbd0: Connection established. size=214291003 KB / blksize=4096 B
Mar 23 11:23:57 nfsa1.mail.back kernel: drbd0: Connection established. size=214291003 KB / blksize=4096 B
Mar 23 11:23:57 nfsa1.mail.back kernel: drbd0: Synchronisation started blks=15
Mar 23 11:24:54 nfsa2.mail.back kernel: drbd0: Connection lost.
Mar 23 11:25:02 nfsa1.mail.back kernel: drbd0: Syncer aborted.
Mar 23 11:25:02 nfsa1.mail.back kernel: drbd0: Connection lost.
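
To put numbers on how long each attempt survives, the timestamps above can be compared with a small script. This is only a rough sketch: the function name and the regular expression are my own and not part of drbd, it assumes the syslog line format shown above, and since syslog does not record the year I pass it in by hand.

```python
import re
from datetime import datetime

# Matches lines such as:
# "Mar 23 11:21:11 nfsa1.mail.back kernel: drbd0: Synchronisation started blks=15"
LINE_RE = re.compile(
    r"^(\w{3}\s+\d+ \d\d:\d\d:\d\d) (\S+) kernel: (drbd\d+): (.*)$"
)


def sync_attempt_durations(lines, year=2004):
    """Seconds from each 'Synchronisation started' to the next 'Connection lost'.

    Lines from both nodes may be mixed; the first 'Connection lost' seen on
    either node after a start ends that attempt.
    """
    durations = []
    started = None
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue
        # Month names are parsed with %b, so an English/C locale is assumed.
        stamp = datetime.strptime(f"{year} {m.group(1)}", "%Y %b %d %H:%M:%S")
        message = m.group(4)
        if message.startswith("Synchronisation started"):
            started = stamp
        elif message.startswith("Connection lost") and started is not None:
            durations.append(int((stamp - started).total_seconds()))
            started = None
    return durations


if __name__ == "__main__":
    sample = [
        "Mar 23 11:21:11 nfsa1.mail.back kernel: drbd0: Synchronisation started blks=15",
        "Mar 23 11:22:58 nfsa2.mail.back kernel: drbd0: Connection lost.",
        "Mar 23 11:22:59 nfsa1.mail.back kernel: drbd0: Connection lost.",
        "Mar 23 11:22:59 nfsa1.mail.back kernel: drbd0: Synchronisation started blks=15",
        "Mar 23 11:23:50 nfsa2.mail.back kernel: drbd0: Connection lost.",
    ]
    print(sync_attempt_durations(sample))
```

On the excerpt above, the three attempts last 107, 51 and 57 seconds before nfsa2 reports the connection as lost.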

Am I right in assuming that, when the load average is high on the master
and the "application" write speed is higher than sync-min, the syncer,
running at low priority, cannot keep pace and disconnects?

Also, I was getting a significant number of messages like these:

Mar 22 16:29:27 nfsa1.mail.back kernel: drbd0: [kjournald/819] sock_sendmsg timeout count down: ko=4294967295
Mar 22 16:34:37 nfsa1.mail.back kernel: drbd0: pending_cnt <0 !!!
Mar 22 16:44:02 nfsa1.mail.back kernel: drbd0: [kjournald/819] sock_sendmsg timeout count down: ko=4294967295
Mar 22 16:58:34 nfsa1.mail.back kernel: drbd0: [kjournald/819] sock_sendmsg timeout count down: ko=4294967295
Mar 22 17:08:26 nfsa1.mail.back kernel: drbd0: [kjournald/819] sock_sendmsg timeout count down: ko=4294967295
...
Mar 22 19:35:09 nfsa1.mail.back kernel: drbd0: [drbd_syncer_0/6603] sock_sendmsg timeout count down: ko=4294967295

What do they mean?
The kernel is 2.4.25 (smp, on Xeon), drbd 0.6.10-cvs (now trying
0.6.12).
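
One observation on the ko value in the sock_sendmsg lines: 4294967295 is exactly 2^32 - 1, which is what -1 looks like when printed as an unsigned 32-bit integer. Whether drbd really keeps ko in an unsigned int and counts it down past zero is purely my assumption; the sketch below (helper name is mine) only checks the arithmetic.

```python
KO_LOGGED = 4294967295  # value copied from the kernel messages above


def as_signed32(value):
    """Reinterpret an unsigned 32-bit value as a signed 32-bit integer."""
    value &= 0xFFFFFFFF
    return value - 2**32 if value >= 2**31 else value


if __name__ == "__main__":
    print(KO_LOGGED == 2**32 - 1)   # True
    print(as_signed32(KO_LOGGED))   # -1
```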

Thanks
Eugene