On 26 Apr 2006, Lars Ellenberg wrote:

> if you can "reproduce" this scenario,
> please try with current drbd-0.7 svn,
> which should be released as 0.7.18 soonish.

The same thing still happens with 0.7.18.

Both nodes get disconnected for an unknown reason to me and can't
reconnect.

This time, it's less critical to me because the filesystem is still
readable and writeable.

Here's more information:

Primary :
May 8 13:34:31 mail1 kernel: drbd0: [kjournald/3038] sock_sendmsg time expired, ko = 4294967295
May 8 13:34:34 mail1 kernel: drbd0: [kjournald/3038] sock_sendmsg time expired, ko = 4294967294
May 8 13:34:46 mail1 kernel: drbd0: [kjournald/3038] sock_sendmsg time expired, ko = 4294967295
May 8 13:34:49 mail1 kernel: drbd0: [kjournald/3038] sock_sendmsg time expired, ko = 4294967294
May 8 13:34:52 mail1 kernel: drbd0: PingAck did not arrive in time.
May 8 13:34:52 mail1 kernel: drbd0: drbd0_asender [811]: cstate Connected --> NetworkFailure
May 8 13:34:52 mail1 kernel: drbd0: asender terminated
May 8 13:34:52 mail1 kernel: drbd0: kjournald [3038]: cstate NetworkFailure --> Timeout
May 8 13:34:52 mail1 kernel: drbd0: drbd0_receiver [2493]: cstate Timeout --> BrokenPipe
May 8 13:34:52 mail1 kernel: drbd0: short read expecting header on sock: r=-512
May 8 13:34:52 mail1 kernel: drbd0: short sent UnplugRemote size=8 sent=-1001
May 8 13:34:52 mail1 kernel: drbd0: worker terminated
May 8 13:34:52 mail1 kernel: drbd0: drbd0_receiver [2493]: cstate BrokenPipe --> Unconnected
May 8 13:34:52 mail1 kernel: drbd0: Connection lost.
May 8 13:34:52 mail1 kernel: drbd0: drbd0_receiver [2493]: cstate Unconnected --> WFConnection
May 8 13:34:55 mail1 kernel: drbd0: drbd0_receiver [2493]: cstate WFConnection --> WFReportParams

root@mail1:~# cat /proc/drbd
version: 0.7.18 (api:78/proto:74)
SVN Revision: 2176 build by root@ns2, 2006-04-26 20:05:56
 0: cs:WFReportParams st:Primary/Unknown ld:Consistent
    ns:38632 nr:0 dw:109369656 dr:25406553 al:365924 bm:986 lo:2 pe:0 ua:0 ap:0

Secondary:
May 8 13:34:52 mail2 kernel: drbd0: meta connection shut down by peer.
May 8 13:34:52 mail2 kernel: drbd0: drbd0_asender [7479]: cstate Connected --> NetworkFailure
May 8 13:34:52 mail2 kernel: drbd0: asender terminated
May 8 13:34:52 mail2 kernel: drbd0: drbd0_receiver [30218]: cstate NetworkFailure --> BrokenPipe
May 8 13:34:52 mail2 kernel: drbd0: short read receiving data block: read 3984 expected 4096
May 8 13:34:52 mail2 kernel: drbd0: error receiving Data, l: 4112!
May 8 13:34:52 mail2 kernel: drbd0: worker terminated
May 8 13:34:52 mail2 kernel: drbd0: drbd0_receiver [30218]: cstate BrokenPipe --> Unconnected
May 8 13:34:52 mail2 kernel: drbd0: Connection lost.
May 8 13:34:52 mail2 kernel: drbd0: drbd0_receiver [30218]: cstate Unconnected --> WFConnection
May 8 13:34:55 mail2 kernel: drbd0: drbd0_receiver [30218]: cstate WFConnection --> WFReportParams
May 8 13:34:57 mail2 kernel: drbd0: sock_recvmsg returned -11
May 8 13:34:57 mail2 kernel: drbd0: drbd0_receiver [30218]: cstate WFReportParams --> BrokenPipe
May 8 13:34:57 mail2 kernel: drbd0: short read expecting header on sock: r=-11
May 8 13:34:57 mail2 kernel: drbd0: My msock connect got accepted onto peer's sock!
May 8 13:35:03 mail2 kernel: drbd0: worker terminated
May 8 13:35:03 mail2 kernel: drbd0: drbd0_receiver [30218]: cstate BrokenPipe --> Unconnected
May 8 13:35:03 mail2 kernel: drbd0: Connection lost.
May 8 13:35:03 mail2 kernel: drbd0: drbd0_receiver [30218]: cstate Unconnected --> WFConnection

When I tried to restart DRBD on the secondary, I got:

May 8 13:53:47 mail2 kernel: drbd: initialised. Version: 0.7.18 (api:78/proto:74)
May 8 13:53:47 mail2 kernel: drbd: SVN Revision: 2176 build by root@ns2, 2006-04-26 20:05:56
May 8 13:53:47 mail2 kernel: drbd: registered as block device major 147
May 8 13:53:47 mail2 kernel: drbd0: resync bitmap: bits=8658397 words=270576
May 8 13:53:47 mail2 kernel: drbd0: size = 33 GB (34633588 KB)
May 8 13:53:47 mail2 kernel: klogd 1.4.1, ---------- state change ----------
May 8 13:53:47 mail2 kernel: Loaded 1352 symbols from 22 modules.
May 8 13:53:48 mail2 kernel: drbd0: 0 KB marked out-of-sync by on disk bit-map.
May 8 13:53:48 mail2 kernel: drbd0: Found 4 transactions (136 active extents) in activity log.
May 8 13:53:48 mail2 kernel: drbd0: drbdsetup [445]: cstate Unconfigured --> StandAlone
May 8 13:53:48 mail2 kernel: drbd0: drbdsetup [461]: cstate StandAlone --> Unconnected
May 8 13:53:48 mail2 kernel: drbd0: drbd0_receiver [462]: cstate Unconnected --> WFConnection
May 8 13:54:08 mail2 kernel: drbd0: drbdsetup [479]: cstate WFConnection --> Unconnected
May 8 13:54:08 mail2 kernel: drbd0: worker terminated
May 8 13:54:08 mail2 kernel: drbd0: drbd0_receiver [462]: cstate Unconnected --> StandAlone
May 8 13:54:08 mail2 kernel: drbd0: Connection lost.
May 8 13:54:08 mail2 kernel: drbd0: Discarding network configuration.
May 8 13:54:08 mail2 kernel: drbd0: drbd0_receiver [462]: cstate StandAlone --> StandAlone
May 8 13:54:08 mail2 kernel: drbd0: receiver terminated
May 8 13:54:08 mail2 kernel: drbd0: drbdsetup [479]: cstate StandAlone --> StandAlone
May 8 13:54:08 mail2 kernel: drbd0: drbdsetup [479]: cstate StandAlone --> Unconfigured
May 8 13:54:08 mail2 kernel: drbd0: worker terminated
May 8 13:54:08 mail2 kernel: drbd: module cleanup done.

and absolutely nothing happens on the primary.

When I tried "drbdadm connect all" on the primary, I got:

root@mail1:~# drbdadm connect all
Child process does not terminate!
Exiting.
root@mail1:~#

May 8 14:01:06 mail1 kernel: drbd0: interrupted during initial handshake
May 8 14:01:06 mail1 kernel: drbd0: My msock connect got accepted onto peer's sock!
May 8 14:01:06 mail1 kernel: drbd0: worker terminated

and absolutely nothing happens on the secondary.

I'm running 2.4.27-2-686 (Debian flavor, no patches) on both nodes.
-- 
Cyril Bouthors