You are not having trouble "unmounting drbddisk"; you are having trouble
unmounting the filesystem that sits on top of DRBD. Something is using
that filesystem. Heartbeat attempts to clean up by first issuing
"fuser -mk <mountpoint>", and then "fuser -mk -SIGKILL <mountpoint>",
but there are no processes left to signal.

So you need to find out what else may be using your filesystem. This
may, for example, be a loop device referencing a file on that filesystem
(fuser has no way of finding those, as they're not associated with a
userspace process), or some specific socket types (fuser is known to not
detect those reliably).

DRBD is not at fault here. You either have your Filesystem resource
misconfigured, or some other process/application is intervening, or
(unlikely) a Filesystem RA bug is biting you. The folks over on the
linux-ha mailing list may be able to help you out.

Cheers,
Florian

Predatorz wrote:
> Hi,
>
> When i rebooted my machine, heartbeat will have problems to unmount the drbd
> device below are parts of the logs.
> Machine is running on CentOS 5.2 with drbd 8.2.6 and heartbeat
> 2.1.3-3.el5.centos
> I am trying to do HA for firewall device.
> DRBD device is a 10GB lvm2 disk formatted with ext3.
>
> My haresources file for the resources to be started, it is able to mount the
> drbd device and start all the IPs properly, only problem is that when it try
> to unmount when i issue reboot.
>
> eysihfw1 drbddisk::drbd0 Filesystem::/dev/drbd0::/replicated::ext3
> IPaddr::10.6.1.1/255.255.255.0 xx.xx.xx.xx/27/eth1:0 xx.xx.xx.xx/27/eth1:1
> xx.xx.xx.xx/27/eth1:2 xx.xx.xx.xx/27/eth1:3 xx.xx.xx.xx/27/eth1:4
> xx.xx.xx.xx/27/eth1:5 xx.xx.xx.xx/27/eth1:6 210.23.11.247/27/eth1:7
> xx.xx.xx.xx/27/eth1:8 xx.xx.xx.xx/27/eth1:9 xx.xx.xx.xx/27/eth1:10
> xx.xx.xx.xx/27/eth1:11 xx.xx.xx.xx/27/eth1:12 210.23.11.253/27/eth1:13
> 210.23.11.254/27/eth1:14 openvpn named
>
> Logs from ha-log file before heartbeat get killed ungracefully and reboot
> the box.
>
> ResourceManager[8164]: 2008/07/01_12:22:37 ERROR: Return code 1 from
> /etc/ha.d/resource.d/Filesystem
> ResourceManager[8164]: 2008/07/01_12:22:38 info: Retrying failed stop
> operation [Filesystem::/dev/drbd0::/replicated::ext3]
> ResourceManager[8164]: 2008/07/01_12:22:38 info: Running
> /etc/ha.d/resource.d/Filesystem /dev/drbd0 /replicated ext3 stop
> Filesystem[11342]: 2008/07/01_12:22:38 INFO: Running stop for
> /dev/drbd0 on /replicated
> Filesystem[11342]: 2008/07/01_12:22:38 INFO: Trying to unmount
> /replicated
> Filesystem[11342]: 2008/07/01_12:22:38 ERROR: Couldn't unmount
> /replicated; trying cleanup with SIGTERM
> Filesystem[11342]: 2008/07/01_12:22:38 INFO: No processes on
> /replicated were signalled
> Filesystem[11342]: 2008/07/01_12:22:39 ERROR: Couldn't unmount
> /replicated; trying cleanup with SIGTERM
> Filesystem[11342]: 2008/07/01_12:22:39 INFO: No processes on
> /replicated were signalled
> Filesystem[11342]: 2008/07/01_12:22:40 ERROR: Couldn't unmount
> /replicated; trying cleanup with SIGTERM
> Filesystem[11342]: 2008/07/01_12:22:40 INFO: No processes on
> /replicated were signalled
> Filesystem[11342]: 2008/07/01_12:22:41 ERROR: Couldn't unmount
> /replicated; trying cleanup with SIGKILL
> Filesystem[11342]: 2008/07/01_12:22:41 INFO: No processes on
> /replicated were signalled
> Filesystem[11342]: 2008/07/01_12:22:42 ERROR: Couldn't unmount
> /replicated; trying cleanup with SIGKILL
> Filesystem[11342]: 2008/07/01_12:22:43 INFO: No processes on
> /replicated were signalled
> Filesystem[11342]: 2008/07/01_12:22:44 ERROR: Couldn't unmount
> /replicated; trying cleanup with SIGKILL
> Filesystem[11342]: 2008/07/01_12:22:44 INFO: No processes on
> /replicated were signalled
> Filesystem[11342]: 2008/07/01_12:22:45 ERROR: Couldn't unmount
> /replicated, giving up!
> Filesystem[11331]: 2008/07/01_12:22:45 ERROR: Generic error
> ResourceManager[8164]: 2008/07/01_12:22:45 ERROR: Return code 1 from
> /etc/ha.d/resource.d/Filesystem
> ResourceManager[8164]: 2008/07/01_12:22:46 info: Retrying failed stop
> operation [Filesystem::/dev/drbd0::/replicated::ext3]
> [...]

-- 
: Florian G. Haas
: LINBIT Information Technologies GmbH
: Vivenotgasse 48, A-1120 Vienna, Austria

Enterprise consultancy and support for DRBD is available from LINBIT.
If you are interested, please go to http://www.linbit.com/en/contact
and leave your contact details.

When replying, there is no need to CC my personal address.
I monitor the list on a daily basis. Thank you.
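
As a follow-up to the loop-device hint in the reply: since fuser cannot see loop devices, one way to check for them is to compare each loop device's backing file against the mount point. The following is only a minimal Python sketch, not part of Heartbeat or DRBD. It assumes a kernel that exposes /sys/block/loopN/loop/backing_file, which is newer than the 2.6.18 kernel of CentOS 5.2; on an older system, "losetup -a" lists the same device-to-file mapping. The function name loop_devices_backed_by and its sys_block parameter are made up for this illustration.

```python
import glob
import os


def loop_devices_backed_by(mountpoint, sys_block="/sys/block"):
    """Return (device, backing file) pairs for loop devices whose
    backing file lives under the given mount point.

    sys_block is a hypothetical parameter so the sysfs location can be
    overridden, e.g. for testing.
    """
    # Match on "<mountpoint>/" so that /replicated does not also match
    # files under /replicated2.
    prefix = os.path.realpath(mountpoint).rstrip(os.sep) + os.sep
    pattern = os.path.join(sys_block, "loop*", "loop", "backing_file")
    hits = []
    for attr in sorted(glob.glob(pattern)):
        try:
            with open(attr) as fh:
                backing = fh.read().strip()
        except OSError:
            continue  # device was detached while we were looking
        if backing.startswith(prefix):
            # .../loop0/loop/backing_file -> loop0
            name = os.path.basename(os.path.dirname(os.path.dirname(attr)))
            hits.append(("/dev/" + name, backing))
    return hits


if __name__ == "__main__":
    # /replicated is the mount point from the haresources line above.
    for dev, backing in loop_devices_backed_by("/replicated"):
        print(dev, "->", backing)
```

Any device this reports would have to be detached (for example with "losetup -d") before the filesystem can be unmounted. It does not cover the other case mentioned in the reply, sockets that fuser fails to detect.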