[DRBD-user] Problems with applications on top of DRBD locking up, becoming defunct

Lars Ellenberg lars.ellenberg at linbit.com
Wed Aug 29 23:52:52 CEST 2007


On Wed, Aug 29, 2007 at 07:30:51PM +0200, Lavender, Ben wrote:
> I'm working on getting a few DRBD servers up and running, but I'm having
> a ton of trouble with it.  I've set up DRBD/heartbeat on some smaller
> servers before with no problems, but this current batch is behaving
> strangely with no error messages reported anywhere.
> 
> Basically, any application (so far pound, nfsd, mysql, and ldap, along
> with a number of filesystem tools) will eventually lock up while working
> with the mounted drbd volume.  It can happen with something as simple as
> 'ls', or 'touch'.  After whatever process locks up, the drbd volume is
> considered busy and can't be released without fencing it.  After a
> reboot, this happens in less than 3 hours.  Killing whatever process is
> having the problem will do nothing without -9; with -9 they will
> sometimes become defunct and sometimes do nothing.
> 
> /proc/drbd shows nothing unusual.  This is an example after my second
> two drbd volumes have locked up, but I managed to unmount and secondary
> one of them.  The second one is locked with mysql and the third with
> umount.
> 
> [root@backend1 ben]# cat /proc/drbd
> version: 8.0.5 (api:86/proto:86)
....

> These are the same machines that prompted my 'WFBitMapT and WFBitMapS'
> message from earlier this week which has not seen a response, namely,
> Dell 2950's with hardware RAID (Perc5's on the megaraid driver), x86_64,
> RHEL5, SELinux enabled.  The DRBD volumes live on top of LVM partitions.
> Kernel is 2.6.18-8.1.8, DRBD is 8.0.5, all mentioned software is either
> the CentOS5 or RHEL5 version of the particular package.

your working installations use a different kernel, I assume?
and it locks up with this particular kernel?
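
To compare a working host with a failing one, it can help to record the
kernel release and the DRBD module version on each. The following is only
a sketch: it assumes the version line in /proc/drbd has the form quoted
above ("version: 8.0.5 (api:86/proto:86)"), and the names
parse_drbd_version and host_fingerprint are made up for illustration.

```python
import platform
import re

# Matches the version line as it appears in the quoted /proc/drbd output.
# The proto field is kept as a string because it may not be a single number.
VERSION_RE = re.compile(r"version:\s*(\S+)\s*\(api:(\d+)/proto:([\d-]+)\)")


def parse_drbd_version(text):
    """Return version, api and proto from /proc/drbd text, or None."""
    m = VERSION_RE.search(text)
    if m is None:
        return None
    return {"version": m.group(1), "api": int(m.group(2)), "proto": m.group(3)}


def host_fingerprint(proc_drbd="/proc/drbd"):
    """Collect the kernel release and, if the module is loaded, DRBD info."""
    try:
        with open(proc_drbd) as fh:
            drbd = parse_drbd_version(fh.read())
    except OSError:
        # /proc/drbd is missing when the drbd module is not loaded.
        drbd = None
    return {"kernel": platform.release(), "drbd": drbd}


if __name__ == "__main__":
    print(host_fingerprint())
```

Running this on both a working and a failing machine and diffing the two
results shows quickly whether the kernel or the module differs.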

vendor kernels are known to backport things or introduce them early.
it may well be the same problem that shows up in kernel.org 2.6.22 and later:
a lockup in recursive generic_make_request.
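
Processes that ignore kill -9 are typically in uninterruptible sleep
(state D), and "defunct" ones are zombies (state Z). A rough sketch for
listing such processes from /proc/<pid>/stat follows. It is not a DRBD
tool; the function names are hypothetical, and it says nothing about
why a task is blocked, only that it is.

```python
import os


def parse_stat(text):
    """Split the contents of /proc/<pid>/stat into (pid, comm, state)."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so cut at the first '(' and the last ')'.
    open_paren = text.index("(")
    close_paren = text.rindex(")")
    pid = int(text[:open_paren])
    comm = text[open_paren + 1:close_paren]
    state = text[close_paren + 1:].split()[0]
    return pid, comm, state


def find_stuck(proc_root="/proc", states=("D", "Z")):
    """Return sorted (pid, comm, state) tuples for tasks in the given states."""
    if not os.path.isdir(proc_root):
        return []
    stuck = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                pid, comm, state = parse_stat(fh.read())
        except (OSError, ValueError, IndexError):
            # The process may have exited while we were reading it.
            continue
        if state in states:
            stuck.append((pid, comm, state))
    return sorted(stuck)


if __name__ == "__main__":
    for pid, comm, state in find_stuck():
        print(pid, state, comm)
```

If mysql or umount keep showing up in state D across several runs, that
would be consistent with a lockup in the block layer rather than in the
applications themselves.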

please try again with a recent checkout of drbd from
 svn co http://svn.drbd.org/drbd/branches/drbd-8.0

-- 
: Lars Ellenberg                            Tel +43-1-8178292-0  :
: LINBIT Information Technologies GmbH      Fax +43-1-8178292-82 :
: Vivenotgasse 48, A-1120 Vienna/Europe    http://www.linbit.com :
__
please use the "List-Reply" function of your email client.


