Hi,
On 11/15/2012 09:02 PM, Phillips, Dan wrote:
> Problem:
>
> The problem is that when performing an HA failover from server A to
> server B, a DRBD resource is sometimes not shut down properly on server
> A. Several attempts are made to stop the DRBD resource, but finally it
> gives up and the server is rebooted. The failover to server B works
> properly; B becomes the Active server. After the reboot, server A comes
> up properly as the Standby server.
>
> The problem is intermittent. Most HA failovers work as expected (server
> A does not reboot).
>
> When the problem does occur, the following lines are logged in
> /var/log/messages and displayed on the OOBM:
>
> drbd0: State change failed: Device is held open by someone
> drbd0: state = { cs:Connected st:Primary/Secondary ds:UpToDate/UpToDate r--- }
> drbd0: wanted = { cs:Connected st:Secondary/Secondary ds:UpToDate/UpToDate r--- }
Alright, that's bad enough. The underlying problem is that Pacemaker does
not (or cannot) make sure that the DRBD device is released by whatever is
using it (services/processes, or the kernel itself) before the demote.
What is this DRBD used for? Is there a filesystem in it? Or is it used
by an iSCSI target? Something else entirely?
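To narrow down what holds the device open, something along these lines
might help. This is only a rough Python sketch, not a tested tool: it
assumes a Linux /proc layout and that the device node is /dev/drbd0 (a
guess based on the "drbd0" in the log lines). The function names are made
up for illustration. Note that in-kernel users (for example a kernel iSCSI
target) will not show up as open file descriptors, so an empty result does
not prove the device is free.

```python
import os


def find_open_fds(device, proc_root="/proc"):
    """Return sorted (pid, fd) pairs whose fd symlink points at `device`."""
    target = os.path.realpath(device)
    hits = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric entries are process directories
        fd_dir = os.path.join(proc_root, entry, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # process went away, or we lack permission
        for fd in fds:
            try:
                link = os.readlink(os.path.join(fd_dir, fd))
            except OSError:
                continue  # fd was closed while we were looking
            if link == target:
                hits.append((int(entry), int(fd)))
    return sorted(hits)


def find_mounts(device, mounts_file="/proc/mounts"):
    """Return mount points of filesystems mounted from `device`."""
    target = os.path.realpath(device)
    points = []
    with open(mounts_file) as fh:
        for line in fh:
            fields = line.split()
            # skip pseudo filesystems such as "proc" or "sysfs"
            if len(fields) < 2 or not fields[0].startswith("/"):
                continue
            if os.path.realpath(fields[0]) == target:
                points.append(fields[1])
    return points
```

Run as root on server A just before (or during) a failover, calling
find_open_fds("/dev/drbd0") and find_mounts("/dev/drbd0"). Any PID or
mount point reported is a candidate for what blocks the demote to
Secondary.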
Cheers,
Felix