Hi Lars

As ever, the perfect answer. Thanks for your help. We will see how we get on.

Regards,

Ben

On 2013-01-28 10:41, Lars Ellenberg wrote:
> On Mon, Jan 28, 2013 at 09:31:31AM +0000, Ben Clewett wrote:
>> Hi guys,
>>
>> We have a failure which hits us every few weeks on just one server.
>> We suspect hardware issue on the network card. But it's proving
>> hard to tie down. This is the failure and I would be interested in
>> the opinion of this group.
>>
>> Error:
>>
>> [1580483.649257] block drbd0: magic?? on data m: 0x0 c: 0 l: 0
>
>
> Each DRBD network packet starts with a DRBD specific header.
> That header contains a "magic" number, a "command" id,
> and a payload "length".
>
> All three of them are apparently zeroed out.
> So yes, that pretty much looks like your network path
> somehow managed to zero out at least the start of a packet.
>
>
> The asserts below are "boring", and the code has since been fixed to no
> longer trigger those.
>
>> [1580483.649269] block drbd0: ASSERT FAILED cstate = Connected,
>> expected < WFConnection
>> [1580483.649286] block drbd0: ASSERT( mdev->state.conn < C_CONNECTED
>> ) in
>> /usr/src/packages/BUILD/drbd-8.3.4/obj/default/drbd_receiver.c:4500
>> [1580483.649295] block drbd0: asender terminated
>> [1580483.649301] block drbd0: Terminating asender thread
>> [1580483.649384] block drbd0: Connection closed
>> [1580483.649390] block drbd0: peer( Primary -> Unknown ) conn(
>> Connected -> Unconnected ) pdsk( UpToDate -> DUnknown )
>> [1580483.649396] block drbd0: receiver terminated
>> [1580483.649399] block drbd0: Terminating receiver thread
>>
>> /proc/drbd
>> version: 8.3.4 (api:88/proto:86-91)
>
> I recommend to upgrade to 8.3.15,
> enable "data integrity checksumming",
> run an online-verify,
> and see where that gets you.
>
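
For readers following the thread, the header check Lars describes can be illustrated with a small Python sketch. This is not the DRBD kernel code. It assumes an 8-byte header in network byte order made of a 32-bit magic, a 16-bit command id and a 16-bit payload length, and the constant ASSUMED_MAGIC is my assumption about the 8.3 series that should be checked against the actual source tree. The function name check_header is hypothetical.

```python
import struct

# Assumption: 8-byte header, network byte order:
# 32-bit magic, 16-bit command id, 16-bit payload length.
HEADER = struct.Struct(">IHH")

# Assumption: magic value believed to be used by DRBD 8.3;
# verify against the real source before relying on it.
ASSUMED_MAGIC = 0x83740267


def check_header(raw: bytes):
    """Return (ok, message) for the first bytes of a received packet."""
    if len(raw) < HEADER.size:
        raise ValueError("need at least %d bytes" % HEADER.size)
    magic, command, length = HEADER.unpack_from(raw)
    if magic != ASSUMED_MAGIC:
        # Message shaped like the kernel log line quoted in the thread.
        return False, "magic?? on data m: 0x%x c: %d l: %d" % (magic, command, length)
    return True, "ok: command=%d length=%d" % (command, length)


if __name__ == "__main__":
    # A header whose first 8 bytes were zeroed somewhere on the network path.
    print(check_header(bytes(8)))
    # A well-formed header under the assumptions above.
    print(check_header(HEADER.pack(ASSUMED_MAGIC, 1, 4096)))
```

A packet whose first eight bytes arrive as zeros fails the magic comparison and reports all three fields as 0, which is the pattern in the quoted log line.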