[DRBD-user] Re: Unable to make DRBD Resource Secondary

Todd Denniston Todd.Denniston at ssa.crane.navy.mil
Sat Nov 19 02:03:03 CET 2005


Lars Ellenberg wrote:
> 
> > Also as Tom said "What information should we supply, debug information, to
> > help you debug the problem", and how do we trap the data, the next time it
> > happens?
> 
<SNIP>
>   if "it" happens the next time, i.e.
>   you think drbd should become secondary, but it refuses with
>   "somebody has still opened me for write access" or something like
>   that, and neither fuser nor lsof can tell you who.
> 

I did a failover tonight so I could update the machine, and tried to capture
a little info for you.

Sorry, I did not catch whether it was write or read access.

When I issued `service heartbeat stop`, I got the following in the log
(all the expected services shutting down):
...
Nov 18 17:35:19 foo xinetd[3153]: Reconfigured: new=0 old=4 dropped=0
(services)
Nov 18 17:35:23 foo kernel: lockd: couldn't shutdown host module!
Nov 18 17:35:23 foo kernel: nfsd: last server has exited
Nov 18 17:35:23 foo kernel: nfsd: unexporting all filesystems
Nov 18 17:35:23 foo nfs: nfsd shutdown succeeded
Nov 18 17:35:23 foo nfs: rpc.rquotad shutdown succeeded
Nov 18 17:35:23 foo nfs: Shutting down NFS services:  succeeded
Nov 18 17:35:23 foo rpc.statd[4734]: Caught signal 15, un-registering and
exiting.
Nov 18 17:35:23 foo nfslock: rpc.statd shutdown succeeded
...
Nov 18 17:35:26 foo datadisk: ===> datadisk devnb1 stop <===
Nov 18 17:35:26 foo datadisk: 'devnb1' /dev/nb1 is mounted on /devnb1,
trying to unmount
Nov 18 17:35:26 foo datadisk: umount -v /dev/nb1
Nov 18 17:35:26 foo datadisk: ERROR: umount -v /dev/nb1 [1]:
Nov 18 17:35:26 foo datadisk: ERROR: umount: /devnb1: device is busy
Nov 18 17:35:26 foo datadisk: 'devnb1' trying to kill users of /dev/nb1
Nov 18 17:35:26 foo datadisk: fuser -k -m /dev/nb1
Nov 18 17:35:26 foo datadisk: ERROR: fuser -k -m /dev/nb1 [1]:
Nov 18 17:35:26 foo datadisk: ERROR: NO OUTPUT
Nov 18 17:35:29 foo datadisk: umount -v /dev/nb1
... rinse and repeat the errors and commands.


fuser -a -v -k -m /dev/nb1
showed no processes.
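
For trapping data the next time this happens, here is a minimal sketch of a
snapshot that could be taken at this point. The helper names
`mounts_for_device` and `capture_busy_state` are made up for illustration, and
the choice of what to record is my assumption, not something asked for in this
thread. Since fuser and lsof only report userspace processes, the sketch also
records the mount table, the loaded modules and /proc/drbd.

```shell
#!/bin/sh
# Hypothetical helper: print every mount point backed by a given device.
# Expects /proc/mounts-style input: "device mountpoint fstype options ...".
mounts_for_device() {
    dev=$1
    mounts_file=${2:-/proc/mounts}
    awk -v dev="$dev" '$1 == dev { print $2 }' "$mounts_file"
}

# Hypothetical helper: dump what can be seen about a busy device.
# Nothing here kills or unmounts anything; it only reports.
capture_busy_state() {
    dev=$1
    echo "== mount points of $dev =="
    mounts_for_device "$dev"
    echo "== userspace users according to fuser =="
    fuser -v -m "$dev" 2>&1 || echo "fuser reported no users"
    echo "== loaded modules =="
    cat /proc/modules 2>/dev/null || echo "/proc/modules not readable"
    echo "== drbd state =="
    cat /proc/drbd 2>/dev/null || echo "/proc/drbd not readable"
}
```

Something like `capture_busy_state /dev/nb1 > /tmp/nb1-busy.txt 2>&1` right
after the failed umount would keep the state for later (the output path is
only an example).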


> try to reduce the process list.
> have a look at it: something in there that somewhen in its lifetime
> might have accessed the device?

I killed (service ... stop) everything except syslog,
klog, login and all the [k*] (kernel?) processes.

> if yes: kill it, if possible.
> does drbd still refuse to become secondary?
> repeat.

Still, when I issued `umount /devnb1`, it would fail to unmount.

I ran lsmod and `modprobe -r`ed anything that I knew I did not need to keep
the disks & keyboard running; this included the modules nfsd & lockd**.

Still, when I issued `umount /devnb1`, it would fail to unmount, so I could
never push it to secondary.

I finally did a `umount -r /devnb1`,
then a `umount -l /devnb1`,
and issued `drbdsetup /dev/nb1 secondary`,
but it still failed to become secondary.

After `shutdown -h now` and power down, I made the other machine primary on
/dev/nb1 and ran e2fsck, but it said the device was clean (which was good;
I really did not want to wait the 2 hours for the fsck).

**I don't think that the nfsd & lockd modules should still have been
loaded by that point, because their services were shut down a long time
before. The lockd message on heartbeat stop and the lockd module still in
the kernel were the only strange things I noticed.
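
To make the "module still in the kernel" observation easier to record next
time, here is a small sketch that reads the use count from /proc/modules. The
function names `module_use_count` and `report_modules` are hypothetical, and
it assumes the use count is the third field of each /proc/modules line.

```shell
#!/bin/sh
# Hypothetical helper: print the use count of one module.
# Returns non-zero if the module is not listed in the modules file.
module_use_count() {
    mod=$1
    modules_file=${2:-/proc/modules}
    awk -v mod="$mod" '
        $1 == mod { print $3; found = 1 }
        END { exit !found }
    ' "$modules_file"
}

# Hypothetical helper: report, for each named module, whether it is loaded.
# First argument is the modules file, the rest are module names.
report_modules() {
    modules_file=$1
    shift
    for mod in "$@"; do
        if count=$(module_use_count "$mod" "$modules_file"); then
            echo "$mod: still loaded, use count $count"
        else
            echo "$mod: not loaded"
        fi
    done
}
```

Running `report_modules /proc/modules nfsd lockd` after the services are
stopped would show whether either module is still held by something.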

-- 
Todd Denniston
Crane Division, Naval Surface Warfare Center (NSWC Crane) 
Harnessing the Power of Technology for the Warfighter


