Hello,

there have been no functional problems with DRBD. However, since we last upgraded the kernel (because of the recent nasty security holes in it), DRBD has started disconnecting and reconnecting by itself, with the whole cycle completing in less than one minute. Nothing catastrophic has happened. It does not happen regularly, but rather at random.

I just wonder why this is happening again. In the past we had a similar issue with a DRBD resource disconnecting and reconnecting, but that was caused by using PCI network cards on an AMD machine with more than 3 GB of RAM.

So nothing abnormal, I am just curious about the STOP_SYNC_TIMER message, which appeared for the first time among a bunch of "standard" messages in the kernel log. I'm attaching the whole log of the situation.

I will try to upgrade to DRBD 0.7.21; maybe it will help.

Thank you!
Mirek

On Thu, Aug 10, 2006 at 12:31:33PM +0200, Lars Ellenberg wrote:
> / 2006-08-10 12:24:54 +0200
> \ Miroslav Jany:
> > Hi folks,
> >
> > does anybody know what could these messages in kernel.log mean?
>
> well, yes. one of our ASSERTS triggered. *g*
>
> > Aug 10 04:50:36 ferda kernel: drbd1: _drbd_rs_resume:
> > (test_bit(STOP_SYNC_TIMER,&mdev->flags)) in /usr/src/wdt/drbd-0.7.20/drbd/drbd_worker.c:693
> > Aug 10 04:50:36 ferda kernel: drbd1: STOP_SYNC_TIMER was set in
> > _drbd_rs_resume, but rs_left still 3
>
> I think it is harmless: you probably got disconnected while you had some
> syncgroup in PausedSync state...
>
> but, please give some more context.
> what did you do, what happened, some more messages in syslog around that
> time, do you have any functional problems with drbd, what did /proc/drbd
> say after that...
>
> --
> : Lars Ellenberg Tel +43-1-8178292-0 :
> : LINBIT Information Technologies GmbH Fax +43-1-8178292-82 :
> : Schoenbrunner Str. 244, A-1120 Vienna/Europe http://www.linbit.com :
> __
> please use the "List-Reply" function of your email client.
> _______________________________________________
> drbd-user mailing list
> drbd-user at lists.linbit.com
> http://lists.linbit.com/mailman/listinfo/drbd-user

-------------- next part --------------
Aug 10 04:50:22 ferda kernel: drbd2: PingAck did not arrive in time.
Aug 10 04:50:22 ferda kernel: drbd2: drbd2_asender [23681]: cstate Connected --> NetworkFailure
Aug 10 04:50:22 ferda kernel: drbd2: asender terminated
Aug 10 04:50:22 ferda kernel: drbd2: drbd2_receiver [4626]: cstate NetworkFailure --> BrokenPipe
Aug 10 04:50:22 ferda kernel: drbd2: short read expecting header on sock: r=-512
Aug 10 04:50:22 ferda kernel: drbd2: worker terminated
Aug 10 04:50:22 ferda kernel: drbd2: drbd2_receiver [4626]: cstate BrokenPipe --> Unconnected
Aug 10 04:50:22 ferda kernel: drbd2: Connection lost.
Aug 10 04:50:22 ferda kernel: drbd2: drbd2_receiver [4626]: cstate Unconnected --> WFConnection
Aug 10 04:50:30 ferda kernel: drbd1: PingAck did not arrive in time.
Aug 10 04:50:30 ferda kernel: drbd1: drbd1_asender [27685]: cstate Connected --> NetworkFailure
Aug 10 04:50:30 ferda kernel: drbd1: asender terminated
Aug 10 04:50:30 ferda kernel: drbd1: drbd1_receiver [4618]: cstate NetworkFailure --> BrokenPipe
Aug 10 04:50:30 ferda kernel: drbd1: short read expecting header on sock: r=-512
Aug 10 04:50:30 ferda kernel: drbd1: worker terminated
Aug 10 04:50:30 ferda kernel: drbd1: drbd1_receiver [4618]: cstate BrokenPipe --> Unconnected
Aug 10 04:50:30 ferda kernel: drbd1: Connection lost.
Aug 10 04:50:30 ferda kernel: drbd1: drbd1_receiver [4618]: cstate Unconnected --> WFConnection
Aug 10 04:50:30 ferda kernel: drbd0: PingAck did not arrive in time.
Aug 10 04:50:30 ferda kernel: drbd0: drbd0_asender [23689]: cstate Connected --> NetworkFailure
Aug 10 04:50:30 ferda kernel: drbd0: asender terminated
Aug 10 04:50:30 ferda kernel: drbd0: drbd0_receiver [4610]: cstate NetworkFailure --> BrokenPipe
Aug 10 04:50:30 ferda kernel: drbd0: short read expecting header on sock: r=-512
Aug 10 04:50:30 ferda kernel: drbd0: worker terminated
Aug 10 04:50:30 ferda kernel: drbd0: drbd0_receiver [4610]: cstate BrokenPipe --> Unconnected
Aug 10 04:50:30 ferda kernel: drbd0: Connection lost.
Aug 10 04:50:30 ferda kernel: drbd0: drbd0_receiver [4610]: cstate Unconnected --> WFConnection
Aug 10 04:50:33 ferda kernel: drbd0: drbd0_receiver [4610]: cstate WFConnection --> WFReportParams
Aug 10 04:50:33 ferda kernel: drbd0: Handshake successful: DRBD Network Protocol version 74
Aug 10 04:50:33 ferda kernel: drbd0: Connection established.
Aug 10 04:50:33 ferda kernel: drbd0: I am(P): 1:00000018:00000002:00000597:0000002a:10
Aug 10 04:50:33 ferda kernel: drbd0: Peer(S): 1:00000018:00000002:00000596:0000002a:01
Aug 10 04:50:33 ferda kernel: drbd0: drbd0_receiver [4610]: cstate WFReportParams --> WFBitMapS
Aug 10 04:50:33 ferda kernel: drbd0: Primary/Unknown --> Primary/Secondary
Aug 10 04:50:33 ferda kernel: drbd0: drbd0_receiver [4610]: cstate WFBitMapS --> SyncSource
Aug 10 04:50:33 ferda kernel: drbd0: Resync started as SyncSource (need to sync 760 KB [190 bits set]).
Aug 10 04:50:35 ferda kernel: drbd1: drbd1_receiver [4618]: cstate WFConnection --> WFReportParams
Aug 10 04:50:35 ferda kernel: drbd1: Handshake successful: DRBD Network Protocol version 74
Aug 10 04:50:35 ferda kernel: drbd1: Connection established.
Aug 10 04:50:35 ferda kernel: drbd1: I am(S): 1:00000017:00000001:000001be:00000013:01
Aug 10 04:50:35 ferda kernel: drbd1: Peer(P): 1:00000017:00000001:000001bf:00000013:10
Aug 10 04:50:35 ferda kernel: drbd1: drbd1_receiver [4618]: cstate WFReportParams --> WFBitMapT
Aug 10 04:50:35 ferda kernel: drbd1: Secondary/Unknown --> Secondary/Primary
Aug 10 04:50:35 ferda kernel: drbd2: drbd2_receiver [4626]: cstate WFConnection --> WFReportParams
Aug 10 04:50:35 ferda kernel: drbd2: Handshake successful: DRBD Network Protocol version 74
Aug 10 04:50:35 ferda kernel: drbd2: Connection established.
Aug 10 04:50:35 ferda kernel: drbd2: I am(P): 1:00000010:00000001:00000421:00000021:10
Aug 10 04:50:35 ferda kernel: drbd2: Peer(S): 1:00000010:00000001:00000420:00000021:01
Aug 10 04:50:35 ferda kernel: drbd2: drbd2_receiver [4626]: cstate WFReportParams --> WFBitMapS
Aug 10 04:50:35 ferda kernel: drbd1: drbd1_receiver [4618]: cstate WFBitMapT --> SyncTarget
Aug 10 04:50:35 ferda kernel: drbd1: Resync started as SyncTarget (need to sync 48 KB [12 bits set]).
Aug 10 04:50:35 ferda kernel: drbd1: drbd1_receiver [4618]: cstate SyncTarget --> PausedSyncT
Aug 10 04:50:35 ferda kernel: drbd1: Syncer waits for sync group.
Aug 10 04:50:35 ferda kernel: drbd2: Primary/Unknown --> Primary/Secondary
Aug 10 04:50:35 ferda kernel: drbd0: Resync done (total 2 sec; paused 0 sec; 380 K/sec)
Aug 10 04:50:35 ferda kernel: drbd0: drbd0_worker [331]: cstate SyncSource --> Connected
Aug 10 04:50:35 ferda kernel: drbd1: Syncer continues.
Aug 10 04:50:35 ferda kernel: drbd1: drbd0_worker [331]: cstate PausedSyncT --> SyncTarget
Aug 10 04:50:36 ferda kernel: drbd2: drbd2_receiver [4626]: cstate WFBitMapS --> SyncSource
Aug 10 04:50:36 ferda kernel: drbd2: Resync started as SyncSource (need to sync 216 KB [54 bits set]).
Aug 10 04:50:36 ferda kernel: drbd1: drbd2_receiver [4626]: cstate SyncTarget --> PausedSyncT
Aug 10 04:50:36 ferda kernel: drbd1: Syncer waits for sync group.
Aug 10 04:50:36 ferda kernel: drbd2: Resync done (total 1 sec; paused 0 sec; 216 K/sec)
Aug 10 04:50:36 ferda kernel: drbd2: drbd2_worker [32737]: cstate SyncSource --> Connected
Aug 10 04:50:36 ferda kernel: drbd1: Syncer continues.
Aug 10 04:50:36 ferda kernel: drbd1: drbd2_worker [32737]: cstate PausedSyncT --> SyncTarget
Aug 10 04:50:36 ferda kernel: drbd1: _drbd_rs_resume: (test_bit(STOP_SYNC_TIMER,&mdev->flags)) in /usr/src/wdt/drbd-0.7.20/drbd/drbd_worker.c:693
Aug 10 04:50:36 ferda kernel: drbd1: STOP_SYNC_TIMER was set in _drbd_rs_resume, but rs_left still 3
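
For anyone who wants to follow the connection-state changes per device in a log like the one above, here is a minimal Python sketch. It is not part of DRBD or its tools. The regular expression only assumes the line layout seen in this log ("<date> <host> kernel: drbdN: <thread> [pid]: cstate OLD --> NEW"), and the names `CSTATE_RE`, `cstate_transitions` and `SAMPLE` are made up for this illustration.

```python
import re
from collections import defaultdict

# Matches lines such as (layout taken from the log above, not from any DRBD spec):
# "Aug 10 04:50:22 ferda kernel: drbd2: drbd2_asender [23681]: cstate Connected --> NetworkFailure"
CSTATE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+\S+\s+kernel:\s+"
    r"(?P<dev>drbd\d+):\s+\S+\s+\[\d+\]:\s+cstate\s+"
    r"(?P<old>\w+)\s+-->\s+(?P<new>\w+)\s*$"
)


def cstate_transitions(lines):
    """Group (timestamp, old_state, new_state) tuples by DRBD device name."""
    by_dev = defaultdict(list)
    for line in lines:
        m = CSTATE_RE.match(line.strip())
        if m:  # lines without a cstate change are skipped
            by_dev[m["dev"]].append((m["ts"], m["old"], m["new"]))
    return dict(by_dev)


# A few lines copied from the log above, used as sample input.
SAMPLE = """\
Aug 10 04:50:22 ferda kernel: drbd2: PingAck did not arrive in time.
Aug 10 04:50:22 ferda kernel: drbd2: drbd2_asender [23681]: cstate Connected --> NetworkFailure
Aug 10 04:50:22 ferda kernel: drbd2: drbd2_receiver [4626]: cstate NetworkFailure --> BrokenPipe
Aug 10 04:50:35 ferda kernel: drbd1: drbd0_worker [331]: cstate PausedSyncT --> SyncTarget
""".splitlines()

if __name__ == "__main__":
    for dev, steps in sorted(cstate_transitions(SAMPLE).items()):
        print(dev)
        for ts, old, new in steps:
            print(f"  {ts}  {old} -> {new}")
```

Feeding it the full kernel log instead of `SAMPLE` (for example the lines of an open file) should list, per device, the path from Connected through NetworkFailure back to Connected, which makes it easier to see which resource was in PausedSyncT when the assert fired.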