Hello,
I'm quite new to RHCS and DRBD, so please bear with me.
My problem is that when the RHCS service (mezeo_ha_db) starts, the filesystem is not mounted.
I have:
2 nodes
CentOS 5.3 with RHCS and DRBD (8.3) from the CentOS repo.
I think I have DRBD working. I can issue drbdadm primary <resource>, mount the filesystem, unmount it, set the resource back to secondary, and repeat the same steps on the second node without issue. Here's my drbd.conf:
global {
    usage-count yes;
}
common {
    protocol C;
}
resource drbd_disk {
    on rhcsnode1 {
        device /dev/drbd0;
        disk /dev/hdc1;
        address 10.10.10.100:7789;
        meta-disk internal;
    }
    on rhcsnode2 {
        device /dev/drbd0;
        disk /dev/hdc1;
        address 10.10.10.101:7789;
        meta-disk internal;
    }
}
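For readers who want to reproduce the manual check described above (promote, mount, unmount, demote), it can be sketched as a small shell script. This is only an illustration under stated assumptions, not the exact commands that were run: the `run` helper and the `DRY_RUN` switch are hypothetical, and the mount point, filesystem type and mount option are taken from the fs resource in the cluster configuration rather than from a confirmed manual test. By default it only prints the commands.

```shell
#!/bin/sh
# Sketch of the manual DRBD check: promote, mount, unmount, demote.
# Values below come from drbd.conf and the cluster config in this post.
RESOURCE="drbd_disk"
DEVICE="/dev/drbd0"
MOUNTPOINT="/var/lib/pgsql/data"   # assumed: same mount point the cluster uses

# Hypothetical helper: with DRY_RUN=1 (the default) commands are only printed.
DRY_RUN="${DRY_RUN:-1}"
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

manual_check() {
    run drbdadm primary "$RESOURCE" &&                        # become Primary
    run mount -t ext3 -o noatime "$DEVICE" "$MOUNTPOINT" &&   # mount the fs
    run umount "$MOUNTPOINT" &&                               # unmount again
    run drbdadm secondary "$RESOURCE"                         # back to Secondary
}

manual_check
```

Running it with `DRY_RUN=0` as root on one node, then on the other, would execute the same sequence for real; the service must be stopped in RHCS first so the two do not fight over the device.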
I've added this resource to a service in RHCS, but when I start the service the filesystem is not mounted. Also, if I relocate the service, the service moves to the other node but nothing changes at the filesystem level. Here's my cluster config:
<?xml version="1.0"?>
<cluster alias="pgsql_cluster" config_version="36" name="pgsql_cluster">
  <fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="3"/>
  <clusternodes>
    <clusternode name="rhcsnode1.localdomain" nodeid="2" votes="1">
      <fence/>
    </clusternode>
    <clusternode name="rhcsnode2.localdomain" nodeid="3" votes="1">
      <fence/>
    </clusternode>
  </clusternodes>
  <cman expected_votes="1" two_node="1"/>
  <fencedevices/>
  <rm>
    <failoverdomains>
      <failoverdomain name="fo_domain" nofailback="0" ordered="0" restricted="0"/>
    </failoverdomains>
    <resources>
      <ip address="10.10.10.150" monitor_link="1"/>
      <postgres-8 config_file="/var/lib/pgsql/data/postgresql.conf" name="pgsql_db" postmaster_user="postgres" shutdown_wait="0"/>
      <drbd name="res_drbd" resource="drbd_disk">
        <fs device="/dev/drbd/by-res/drbd_disk" fstype="ext3" mountpoint="/var/lib/pgsql/data" name="fs_pgsql" options="noatime"/>
      </drbd>
    </resources>
    <service autostart="1" exclusive="0" name="mezeo_ha_db" recovery="relocate">
      <drbd ref="res_drbd"/>
    </service>
  </rm>
</cluster>
Here's the output from /var/log/messages:
Dec 3 14:20:27 rhcsnode1 luci[2515]: Unable to retrieve batch 979902234 status from rhcsnode1.localdomain:11111: module scheduled for execution
Dec 3 14:20:27 rhcsnode1 kernel: block drbd0: peer( Primary -> Secondary )
Dec 3 14:20:27 rhcsnode1 clurgmgrd[2572]: <notice> Starting stopped service service:mezeo_ha_db
Dec 3 14:20:27 rhcsnode1 kernel: block drbd0: role( Secondary -> Primary )
Dec 3 14:20:28 rhcsnode1 clurgmgrd[2572]: <notice> Service service:mezeo_ha_db started
Any help is greatly appreciated.
Thanks