Hello,

My cluster had a hiccup today. The primary node was manually soft-rebooted by someone, and DRBD on the secondary node got stuck in a loop trying to start.

I browsed the Filesystem RA, and I see it uses fuser to try to remove processes attached to the filesystem. Obviously this didn't work, as fuser does not return 0.

So I ask: what kind of known or hypothetical situation could cause this problem? I believe I had a bash shell sitting on the filesystem, but that shouldn't be a problem for fuser -k.

I'm also sending this to the DRBD list; maybe it's something purely DRBD related.

The logs showed this when the cluster was shutting down on the primary node:

Filesystem[10970]: 2007/12/27_09:52:27 INFO: Running stop for /dev/drbd0 on /drbd0
Filesystem[10970]: 2007/12/27_09:52:27 INFO: Trying to unmount /drbd0
lrmd[4900]: 2007/12/27_09:52:27 info: RA output: (fs0:stop:stderr) umount: /drbd0: device is busy
umount: /drbd0: device is busy
[... trying to umount several times with fuser and SIGTERM/KILL signals ...]
Filesystem[10970]: 2007/12/27_09:52:32 ERROR: Couldn't unmount /drbd0; trying cleanup with SIGKILL
Filesystem[10970]: 2007/12/27_09:52:32 INFO: No processes on /drbd0 were signalled
Filesystem[10970]: 2007/12/27_09:52:33 ERROR: Couldn't unmount /drbd0, giving up!
lrmd[4900]: 2007/12/27_09:52:34 WARN: Exiting fs0:stop process 10970 returned rc 1.
crmd[4903]: 2007/12/27_09:52:34 ERROR: process_lrm_event: LRM operation fs0_stop_0 (call=83, rc=1) Error unknown error

tia,
r
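
For readers following the thread, the stop sequence visible in the log (unmount, signal whatever holds the mount, retry, escalate from SIGTERM to SIGKILL, give up with rc 1) can be sketched roughly as below. This is an illustration only, not the Filesystem RA's actual code: the function name stop_filesystem, the retry count, and the injectable runner and sleep parameters are assumptions made for the example. The commands it issues are umount and the psmisc form fuser -k -SIGNAL -m.

```python
import subprocess
import time


def run_command(cmd):
    """Run a command and return its exit status (0 means success)."""
    return subprocess.call(cmd)


def stop_filesystem(mountpoint, signals=("TERM", "KILL"), tries=3,
                    runner=run_command, sleep=time.sleep):
    """Hypothetical sketch of an unmount-with-cleanup loop.

    Returns 0 once umount succeeds, or 1 if the mount is still busy
    after every signal has been tried.
    """
    for sig in signals:
        for _ in range(tries):
            if runner(["umount", mountpoint]) == 0:
                return 0
            # -m selects every process using the mounted filesystem,
            # -k sends it the given signal. The exit status of fuser is
            # ignored here; it is non-zero when no process was found.
            runner(["fuser", "-k", "-" + sig, "-m", mountpoint])
            sleep(1)
    # Last attempt after the final signal round.
    return 0 if runner(["umount", mountpoint]) == 0 else 1
```

In the situation logged above, every umount call fails and fuser finds nothing to signal, so a loop of this shape runs out of attempts and returns 1, which matches the "giving up!" line and the rc 1 reported by lrmd.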