[DRBD-user] Re: Blocking DRBD

Todd Denniston Todd.Denniston at ssa.crane.navy.mil
Tue Feb 13 18:48:57 CET 2007



Cassiano Surek wrote:
> Hi mate,
> 
> Have you ever gotten around the following DRBD quirk? That's your post in
> the list...
> Thanks!
> Cass
> 

Nope.
I have had to resort to adding a heartbeat target which, when it is called
with stop and sees drbd still active, tells the APC UPS to power down in
60 seconds and then issues a `shutdown -h now`. At least that way the other
system can take over, i.e., not a good solution.
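
A rough sketch of the logic of such a stop target is below. It is not the
actual script: the function names, the `st:Primary/` pattern matched in
/proc/drbd, and the UPS_POWEROFF_CMD placeholder are all assumptions made
for illustration. Check what your DRBD version prints and what your UPS
software provides before using anything like it.

```shell
#!/bin/sh
# Sketch of a last-resort heartbeat "stop" action. Names are hypothetical.

# Assumption: the DRBD status file lists the local role first, as in
# "st:Primary/Secondary". Adjust the pattern for your DRBD version.
DRBD_STATUS_FILE="${DRBD_STATUS_FILE:-/proc/drbd}"

# Hypothetical placeholder: the command that asks the UPS to cut power
# after a delay. The default only prints a message.
UPS_POWEROFF_CMD="${UPS_POWEROFF_CMD:-echo would-ask-UPS-to-power-down-in-60s}"

# The halt command, overridable so the logic can be dry-run.
HALT_CMD="${HALT_CMD:-shutdown -h now}"

# Returns 0 if any DRBD device still reports this node as Primary.
drbd_still_primary() {
    grep -q 'st:Primary/' "$DRBD_STATUS_FILE" 2>/dev/null
}

# Called with the action heartbeat passes to the resource script.
last_resort() {
    case "$1" in
        stop)
            if drbd_still_primary; then
                $UPS_POWEROFF_CMD   # schedule the power cut first
                $HALT_CMD           # then halt, so the peer can take over
            fi
            ;;
        *)
            # start and status have nothing to do in this sketch
            ;;
    esac
    return 0
}
```

The target would go last in the stop order, so it only fires when everything
before it has already failed to release drbd. For a dry run, set
HALT_CMD="echo halt" before calling `last_resort stop`.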

> Lars Ellenberg wrote:
>>
>> > Also as Tom said "What information should we supply, debug information,
>> > to help you debug the problem", and how do we trap the data, the next
>> > time it happens?
>>
> <SNIP>
>> if "it" happens the next time, i.e.
>> you think drbd should become secondary, but it refuses with
>> "somebody has still opened me for write access" or something like
>> that, and neither fuser nor lsof can tell you who.
>>
> 
> I did a failover tonight so I could update the machine, and tried to
> capture you a little info.
> 
> Sorry, I missed catching whether it was write or read access.
> 
> when I issued `service heartbeat stop` I got the following in the log:
> all the expected services shutting down
> ...
> Nov 18 17:35:19 foo xinetd[3153]: Reconfigured: new=0 old=4 dropped=0 (services)
> Nov 18 17:35:23 foo kernel: lockd: couldn't shutdown host module!
> Nov 18 17:35:23 foo kernel: nfsd: last server has exited
> Nov 18 17:35:23 foo kernel: nfsd: unexporting all filesystems
> Nov 18 17:35:23 foo nfs: nfsd shutdown succeeded
> Nov 18 17:35:23 foo nfs: rpc.rquotad shutdown succeeded
> Nov 18 17:35:23 foo nfs: Shutting down NFS services: succeeded
> Nov 18 17:35:23 foo rpc.statd[4734]: Caught signal 15, un-registering and exiting.
> Nov 18 17:35:23 foo nfslock: rpc.statd shutdown succeeded
> ...
> Nov 18 17:35:26 foo datadisk: ===> datadisk devnb1 stop <===
> Nov 18 17:35:26 foo datadisk: 'devnb1' /dev/nb1 is mounted on /devnb1, trying to unmount
> Nov 18 17:35:26 foo datadisk: umount -v /dev/nb1
> Nov 18 17:35:26 foo datadisk: ERROR: umount -v /dev/nb1 [1]:
> Nov 18 17:35:26 foo datadisk: ERROR: umount: /devnb1: device is busy
> Nov 18 17:35:26 foo datadisk: 'devnb1' trying to kill users of /dev/nb1
> Nov 18 17:35:26 foo datadisk: fuser -k -m /dev/nb1
> Nov 18 17:35:26 foo datadisk: ERROR: fuser -k -m /dev/nb1 [1]:
> Nov 18 17:35:26 foo datadisk: ERROR: NO OUTPUT
> Nov 18 17:35:29 foo datadisk: umount -v /dev/nb1
> ... rinse and repeat the errors and commands.
> 
> 
> fuser -a -v -k -m /dev/nb1
> showed no processes.
> 
> 
>> try to reduce the process list.
>> have a look at it: is there something in there that at some point in its
>> lifetime might have accessed the device?
> 
> I killed (service ... stop) everything except syslog,
> klog, login and all the [k*] (kernel???) processes.
> 
>> if yes: kill it, if possible.
>> does drbd still refuse to become secondary?
>> repeat.
> 
> still when I issued `umount /devnb1` it would fail to unmount.
> 
> I ran lsmod, and `modprobe -r`ed anything that I knew I did not need to keep
> the disks & keyboard running; this included the modules nfsd & lockd**.
> 
> still when I issued `umount /devnb1` it would fail to unmount, so I could
> never push it to secondary.
> 
> I finally did a `umount -r /devnb1`
> then a `umount -l /devnb1`,
> and issued `drbdsetup /dev/nb1 secondary`
> but it still failed to become secondary.
> 
> after `shutdown -h now` and power down, I made the other machine primary on
> /dev/nb1 and did an e2fsck, but it said the device was clean (which was
> good, I really did not want to wait the 2 hours for the fsck).
> 
> **I don't think that the nfsd & lockd modules should have been
> running by that point because their services were shut down a long time
> earlier. The lockd message on heartbeat stop and the lockd module still in
> the kernel were the only strange things I noticed.
> 
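
For trapping the data the next time it happens, something along the lines
of the sketch below could snapshot what the thread asks about (mounts,
fuser, lsof, loaded modules, the process list) in one go, before anything
gets killed or unloaded. The function names are made up for illustration,
and /proc/drbd is assumed to be where your DRBD version reports its state.

```shell
#!/bin/sh
# Sketch: snapshot the state of a device that refuses to unmount.

# Run one command, recording its output and exit status under a header.
capture() {
    echo "### $*"
    status=0
    "$@" 2>&1 || status=$?
    echo "### exit status: $status"
}

# Usage: snapshot_busy_device DEVICE MOUNTPOINT > report.txt
snapshot_busy_device() {
    dev="$1"
    mnt="$2"
    capture date
    capture grep -- "$dev" /proc/mounts   # is it still mounted, and where?
    capture fuser -v -m "$dev"            # processes using the filesystem
    capture lsof "$mnt"                   # open files on that filesystem
    capture lsmod                         # modules still loaded (nfsd, lockd?)
    capture ps ax                         # process list, to prune later
    capture cat /proc/drbd                # assumed location of DRBD state
}
```

For the case above that would be
`snapshot_busy_device /dev/nb1 /devnb1 > /root/busy-report.txt`,
run as root right after the first umount failure. Note that it can only
record what the tools see, so it may still come up empty the way fuser did.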


-- 
Todd Denniston
Crane Division, Naval Surface Warfare Center (NSWC Crane)
Harnessing the Power of Technology for the Warfighter


