[DRBD-user] DRBD + RHCS - Failover not working

Gianluca Cecchi gianluca.cecchi at gmail.com
Wed Dec 9 11:46:45 CET 2009



On Wed, Dec 9, 2009 at 2:02 AM, James Perry <jperry at mezeo.com> wrote:
[snip]
> Here's the error I'm getting... It appears that DRBD is failing but I can't tell why.
>
> Dec  7 14:34:49 rhcsnode1 clurgmgrd[8024]: <notice> Service service:mezeo_ha_db started
> Dec  7 14:36:36 rhcsnode1 clurgmgrd: [8024]: <err> script:pgsql_svc: status of /etc/rc.d/init.d/postgresql failed (returned 1)
> Dec  7 14:36:36 rhcsnode1 clurgmgrd[8024]: <notice> status on script "pgsql_svc" returned 1 (generic error)
> Dec  7 14:36:36 rhcsnode1 clurgmgrd[8024]: <notice> Stopping service service:mezeo_ha_db
> Dec  7 14:36:37 rhcsnode1 clurgmgrd: [8024]: <err> script:pgsql_svc: stop of /etc/rc.d/init.d/postgresql failed (returned 1)
> Dec  7 14:36:37 rhcsnode1 clurgmgrd[8024]: <notice> stop on script "pgsql_svc" returned 1 (generic error)

From what you posted, one can only deduce that your
/etc/rc.d/init.d/postgresql script perhaps does not conform to what
is expected.
In fact clurgmgrd is not able to evaluate the result of postgresql status:
script:pgsql_svc: status of /etc/rc.d/init.d/postgresql failed (returned 1)

Does this depend on you killing the postmaster process or something
similar? I don't think so...
On a test server with CentOS 5.4 and a clean postgresql-server
installed, even if I do a kill -9 of the postmaster pid, so that I
have the file /var/run/postmaster.5432.pid without the process itself,
service postgresql status gives:
[root at c54vm1 ~]# service postgresql status
postmaster is stopped
[root at c54vm1 ~]# echo $?
3

(see also /etc/rc.d/init.d/functions)

AFAIK this exit code (3) is what should be returned to rhcs when a
service is not running.
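
For reference, the exit codes that the LSB spec defines for the status
action of an init script can be sketched as a small lookup. This is only
an illustration: describe_lsb_status is a hypothetical helper name, not
something shipped with rhcs or with the postgresql package.

```shell
#!/bin/sh
# describe_lsb_status: hypothetical helper, for illustration only.
# Maps the exit code of "<init script> status" to its LSB meaning.
describe_lsb_status() {
    case "$1" in
        0) echo "program is running or service is OK" ;;
        1) echo "program is dead and /var/run pid file exists" ;;
        2) echo "program is dead and /var/lock lock file exists" ;;
        3) echo "program is not running" ;;
        4) echo "program or service status is unknown" ;;
        *) echo "reserved or application-specific code" ;;
    esac
}
```

It could be used right after a manual status check, for example:
service postgresql status >/dev/null; describe_lsb_status $?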

So, coming back to your system, clurgmgrd decides to stop the service
because it is not able to evaluate its status, and the stop gives an
error too:
script:pgsql_svc: stop of /etc/rc.d/init.d/postgresql failed (returned 1)
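
One way to narrow this down might be to run the same actions by hand,
outside the cluster manager, and look at the exit code of each one. A
minimal sketch, assuming the script path from your log; check_action is
a hypothetical helper name, not an rhcs tool:

```shell
#!/bin/sh
# check_action: hypothetical helper, for illustration only.
# Runs one action of an init script, hides the script's own output,
# prints the exit code it returned and passes that code on.
check_action() {
    script=$1
    action=$2
    "$script" "$action" >/dev/null 2>&1
    rc=$?
    echo "$action returned $rc"
    return $rc
}

# Example, to be run as root on the node that owns the service:
#   check_action /etc/rc.d/init.d/postgresql status
#   check_action /etc/rc.d/init.d/postgresql stop
```

Under LSB, a stop on a service that is already stopped should still
return 0, so a "stop returned 1" here would point at the script or at
the PostgreSQL setup rather than at the cluster software.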

Note also these:
The following rules apply to parent/child relationships in a resource tree:
• Parents are started before children.
• Children must all stop cleanly before a parent may be stopped.
• For a resource to be considered in good health, all its children
must be in good health.

HIH,
Gianluca

PS: you have the default resource provided by rhcs for postgresql in
the resource section, but you are using the standard postgresql init
script in the service section as an external script... any reason?


