Hello, all.

I am quite new to drbd. I have just set up my first pair of machines and encountered the following strange behaviour: for several hours everything works fine, but then the machines lose each other. Messages in syslog are:

On primary machine:

May 30 21:47:37 esd1 kernel: drbd0: meta connection shut down by peer.
May 30 21:47:37 esd1 kernel: drbd0: short read expecting header on sock: r=0
May 30 21:47:39 esd1 kernel: drbd0: tl_clear()
May 30 21:47:58 esd1 kernel: drbd0: sock_sendmsg returned -32
May 30 21:47:58 esd1 kernel: drbd0: short sent ReportBitMap size=4096 sent=0
May 30 21:47:59 esd1 kernel: drbd0: ASSERT( mdev->state.conn < Connected ) in /home/Mick/drbd-8.2.5/drbd/drbd_receiver.c:2838
May 30 21:47:59 esd1 kernel: drbd0: sock_sendmsg returned -32
May 30 21:47:59 esd1 kernel: drbd0: short sent ReportState size=12 sent=0
May 30 21:47:59 esd1 kernel: drbd0: meta connection shut down by peer.
May 30 21:47:59 esd1 kernel: drbd0: sock_sendmsg returned -32
May 30 21:47:59 esd1 kernel: drbd0: short sent ReportBitMap size=4096 sent=0
May 30 21:48:01 esd1 kernel: drbd0: tl_clear()
May 30 21:48:26 esd1 kernel: drbd0: sock_sendmsg returned -32
May 30 21:48:26 esd1 kernel: drbd0: short sent ReportBitMap size=4096 sent=0
May 30 21:48:26 esd1 kernel: drbd0: ASSERT( mdev->state.conn < Connected ) in /home/Mick/drbd-8.2.5/drbd/drbd_receiver.c:2838
May 30 21:48:26 esd1 kernel: drbd0: sock_sendmsg returned -32
May 30 21:48:26 esd1 kernel: drbd0: short sent ReportState size=12 sent=0
May 30 21:48:26 esd1 kernel: drbd0: meta connection shut down by peer.
May 30 21:48:27 esd1 kernel: drbd0: sock_sendmsg returned -32
May 30 21:48:27 esd1 kernel: drbd0: short sent ReportBitMap size=4096 sent=0
May 30 21:48:28 esd1 kernel: drbd0: tl_clear()
May 30 21:51:56 esd1 kernel: drbd0: short read expecting header on sock: r=0
May 30 21:51:56 esd1 kernel: drbd0: tl_clear()
May 30 21:52:22 esd1 kernel: drbd0: PingAck did not arrive in time.
May 30 21:52:29 esd1 kernel: drbd0: error receiving ReportBitMap, l: 4088!
May 30 21:52:29 esd1 kernel: drbd0: tl_clear()
May 30 22:05:09 esd1 kernel: drbd0: short read expecting header on sock: r=0
May 30 22:05:09 esd1 kernel: drbd0: tl_clear()
May 30 22:19:06 esd1 kernel: drbd0: short read expecting header on sock: r=0
May 30 22:19:07 esd1 kernel: drbd0: meta connection shut down by peer.
May 30 22:19:07 esd1 kernel: drbd0: tl_clear()
May 30 22:19:13 esd1 kernel: drbd0: sock_sendmsg returned -32
May 30 22:19:13 esd1 kernel: drbd0: Authentication of peer failed
May 30 22:19:13 esd1 kernel: drbd0: Discarding network configuration.
May 30 22:19:13 esd1 kernel: drbd0: tl_clear()

On secondary:

May 30 21:48:04 esd2 kernel: drbd0: Error receiving initial packet
May 30 21:48:07 esd2 kernel: drbd0: Error receiving initial packet
May 30 21:48:25 esd2 kernel: drbd0: PingAck did not arrive in time.
May 30 21:48:25 esd2 kernel: drbd0: error receiving ReportBitMap, l: 4088!
May 30 21:48:25 esd2 kernel: drbd0: tl_clear()
May 30 21:48:31 esd2 kernel: drbd0: Error receiving initial packet
May 30 21:51:56 esd2 kernel: drbd0: PingAck did not arrive in time.
May 30 21:51:56 esd2 kernel: drbd0: short read receiving data: read 2888 expected 4096
May 30 21:51:56 esd2 kernel: drbd0: error receiving Data, l: 4120!
May 30 21:51:56 esd2 kernel: drbd0: tl_clear()
May 30 21:52:08 esd2 kernel: drbd0: PingAck did not arrive in time.
May 30 21:52:29 esd2 kernel: drbd0: ASSERT( mdev->state.conn < Connected ) in /home/Mick/drbd-8.2.5/drbd/drbd_receiver.c:2838
May 30 21:52:29 esd2 kernel: drbd0: md_sync_timer expired! Worker calls drbd_md_sync().
May 30 21:52:29 esd2 kernel: drbd0: tl_clear()
May 30 22:05:08 esd2 kernel: drbd0: PingAck did not arrive in time.
May 30 22:05:08 esd2 kernel: drbd0: short read expecting header on sock: r=-512
May 30 22:05:08 esd2 kernel: drbd0: tl_clear()
May 30 22:19:06 esd2 kernel: drbd0: PingAck did not arrive in time.
May 30 22:19:06 esd2 kernel: drbd0: short read expecting header on sock: r=-512
May 30 22:19:06 esd2 kernel: drbd0: tl_clear()
May 30 22:19:12 esd2 kernel: drbd0: sock_recvmsg returned -11
May 30 22:19:12 esd2 kernel: drbd0: short read expecting header on sock: r=-11
May 30 22:19:12 esd2 kernel: drbd0: Authentication of peer failed
May 30 22:19:12 esd2 kernel: drbd0: Discarding network configuration.
May 30 22:19:12 esd2 kernel: drbd0: tl_clear()

Both machines run Fedora Core 6, kernel 2.6.22.14-72.fc6, drbd-8.2.5, compiled from source. The only idea I have is some problem with the NIC, but other processes sharing the same NIC do not indicate any errors. Googling did not give me an answer, if it matters.

Any ideas?
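
In case it helps anyone reading the logs: the negative numbers after "returned" and "r=" look like negated Linux error codes. Below is a minimal Python sketch for decoding them, assuming standard Linux errno numbering. The names KERNEL_ERRNO and decode_line are made up for this illustration and are not part of drbd, and the table only covers the three codes that appear in this post.

```python
import re

# Assumption: the kernel logs these as negated Linux errno values.
# 512 (ERESTARTSYS) is kernel-internal and never reaches userspace.
KERNEL_ERRNO = {
    11: ("EAGAIN", "resource temporarily unavailable; on a socket with a "
                   "receive timeout this typically means the timeout expired"),
    32: ("EPIPE", "broken pipe; the other end has closed the connection"),
    512: ("ERESTARTSYS", "kernel-internal; the call was interrupted by a signal"),
}

# Matches "returned -32" and "r=-512", but not "r=0" (no error code there).
CODE_RE = re.compile(r"(?:returned |r=)-(\d+)")


def decode_line(line):
    """Return (code, name, meaning) for a log line, or None if no code found."""
    match = CODE_RE.search(line)
    if match is None:
        return None
    code = int(match.group(1))
    name, meaning = KERNEL_ERRNO.get(code, ("UNKNOWN", "not in this table"))
    return code, name, meaning


if __name__ == "__main__":
    samples = [
        "May 30 21:47:58 esd1 kernel: drbd0: sock_sendmsg returned -32",
        "May 30 22:05:08 esd2 kernel: drbd0: short read expecting header on sock: r=-512",
        "May 30 22:19:12 esd2 kernel: drbd0: sock_recvmsg returned -11",
    ]
    for sample in samples:
        print(decode_line(sample))
```

Read this way, the primary keeps trying to send on a connection the peer has already closed (-32), while the secondary is mostly timing out waiting for data (-11, and the PingAck messages). This only restates what the logs say; it does not by itself identify the cause.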