[DRBD-user] drbd connection dying badly, ever-rising load, requiring hard machine reset

Lars Ellenberg lars.ellenberg at linbit.com
Tue Nov 8 14:53:47 CET 2016


On Tue, Nov 08, 2016 at 12:02:51PM +0100, Christoph Lechleitner wrote:
> On 2016-09-30 at 16:08, Christoph Lechleitner wrote:
> >Hi everyone!
> >
> >Sorry for my lengthy mail, but I think I have to include some background ...
> >
> >[...]
> >
> >Starting last July a very nasty problem came up every now and then, 7
> >times so far, with no immediate pattern regarding hardware model or so:
> >
> >First one virtualized guest, presumably during an I/O peak like
> >rsync-over-ssh of a large directory, becomes unreachable and unusable.
> >
> >Simultaneously the system load (i.e. the first number in /proc/loadavg)
> >starts to rise, slowly (about +1 every 3-5 minutes) but forever, to 1000
> >(in words: one thousand) and more.
> >
> >[...]
> >
> >However, it's
> >- impossible to disconnect the hanging drbd device
> >- impossible to kill related processes like drbdXX_submit, jbd2/drbdXX-8
> >- impossible to stop the fallen one or any other virtual machine
> >- hence impossible to do a clean shutdown or reboot
> >
> >The only way out is to press the reset button, either physically on site
> >or virtually using BMC/KVM/IPMI services.
> >
> >While I'm not entirely sure DRBD is to blame, 6 of 7 cases started with
> >weird drbd related messages in syslog.
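
A note on the symptom described above: on Linux the load average counts not only runnable tasks but also tasks in uninterruptible sleep (state "D"), so every additional task that blocks on the stuck device raises the load by one and never lowers it again. That matches a load that climbs slowly but without limit, together with processes that cannot be killed. The following is a minimal diagnostic sketch, assuming a Linux /proc filesystem; the helper names (parse_loadavg, task_state, blocked_tasks) are hypothetical and chosen for this illustration only, and the script has nothing to do with the actual fix.

```python
import os


def parse_loadavg(text):
    """Return the 1-, 5- and 15-minute load averages from /proc/loadavg content."""
    fields = text.split()
    return tuple(float(f) for f in fields[:3])


def task_state(stat_line):
    """Return the single-letter state field from a /proc/<pid>/stat line.

    The command name is wrapped in parentheses and may itself contain
    spaces or parentheses, so split after the LAST ')'.
    """
    return stat_line.rsplit(")", 1)[1].split()[0]


def blocked_tasks(proc_root="/proc"):
    """List (pid, comm) for every process currently in state 'D'.

    Only the per-process stat files are scanned; kernel threads such as
    the DRBD and jbd2 workers have their own entries under /proc.
    """
    blocked = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                line = fh.read()
        except OSError:
            continue  # the task exited while we were scanning
        if task_state(line) == "D":
            comm = line[line.index("(") + 1:line.rindex(")")]
            blocked.append((int(entry), comm))
    return blocked
```

Calling parse_loadavg(open("/proc/loadavg").read()) periodically and printing blocked_tasks() alongside it would show whether the rising load is made up of tasks stuck in state "D", and which ones.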
> 
> Thanks, mailing list, for ignoring me ;-)
>
> We finally got professional help from Richard Weinberger, and with the full
> log of just one occurrence of the problem (thanks to netconsole) he found a
> real and severe bug in DRBD8.
> 
> He just posted the patch on the drbd-dev list.

Well, thanks :-)

> We also integrated the patch into our Debian packaging of Linbit's drbd8
> 8.4.9-1 tarball, see bottom of https://confluence.clazzes.org/x/CgC2
> ("Probable Solution"), so we and other Debian jessie users (and maybe some
> users of Debian derivatives) can install the patched module right away.
> 
> Regards,
> 
> Christoph Lechleitner

-- 
: Lars Ellenberg
: LINBIT | Keeping the Digital World Running
: DRBD -- Heartbeat -- Corosync -- Pacemaker

DRBD® and LINBIT® are registered trademarks of LINBIT
__
please don't Cc me, but send to list -- I'm subscribed


