[drbd-mc] Help : Heartbeat failover not working

Alan Robertson alanr at unix.sh
Thu Aug 4 16:18:43 CEST 2011


On 08/03/2011 09:12 AM, vijay patel wrote:
> Hi,
>
> I have configured DRBD and Heartbeat on two nodes for High 
> Availability of NFS.
>
> [root at PdCuLx0501p ~]# rpm -qa | grep drbd
> drbd82-8.2.6-1.el5.centos
> kmod-drbd82-8.2.6-2
>
>
> [root at PdCuLx0501p ~]# rpm -qa | grep heartbeat
> heartbeat-pils-2.1.4-11.el5
> heartbeat-ldirectord-2.1.4-11.el5
> heartbeat-stonith-2.1.4-11.el5
> heartbeat-devel-2.1.4-11.el5
> heartbeat-2.1.4-11.el5
> heartbeat-gui-2.1.4-11.el5
>
>
> [root at PdCuLx0501p ~]# cat /proc/drbd
> version: 8.2.6 (api:88/proto:86-88)
> GIT-hash: 3e69822d3bb4920a8c1bfdf7d647169eba7d2eb4 build by buildsvn at c5-x8664-build, 2008-10-03 11:30:17
>  0: cs:Connected st:Primary/Secondary ds:UpToDate/UpToDate B r---
>     ns:1453512252 nr:1076780 dw:1454589032 dr:4588973 al:57316104 bm:686 lo:0 pe:0 ua:0 ap:0 oos:0
> [root at PdCuLx0501p ~]#
>
>
> DRBD status is connected and in sync. When I stop the heartbeat
> service on the primary server to test failover, the primary server
> reboots. I can see the secondary server acquiring the virtual IP and
> mounting the shared filesystem, but on the client I get the error
> "permission denied. /mnt/nfs is readonly".
>
> Here is my configuration files:
>
> [root at PdCuLx0501p ~]# cat /etc/drbd.conf
> global {
>     usage-count ask;
> }
>
> common {
>   syncer { rate 250M; }
> }
> resource r0 {
>         protocol B;
>         #incon-degr-cmd "halt -f";
>         startup {
>                 degr-wfc-timeout 120;    # 2 minutes.
>         }
>         disk {
>                 on-io-error   detach;
>         }
>         net {
>         }
>         syncer {
>                 rate 250M;
>                 #group 1;
>                 al-extents 257;
>         }
>         on PdCuLx0501p {
>                 device          /dev/drbd0;
>                 meta-disk       /dev/sdb2[0];
>                 disk            /dev/sdb3;
>                 address         10.153.80.213:7788;
>         }
>         on PdCuLx0502s {
>                 device          /dev/drbd0;
>                 meta-disk       /dev/sdb2[0];
>                 disk            /dev/sdb3;
>                 address         10.153.80.214:7788;
>         }
> }
>
>
> [root at PdCuLx0501p ~]# cat /etc/ha.d/ha.cf
> logfacility local0
> keepalive 2
> deadtime 30
> bcast eth0
> node PdCuLx0501p
> node PdCuLx0502s
> auto_failback on
>
>
> [root at PdCuLx0501p ~]# cat /etc/ha.d/haresources
> PdCuLx0501p IPaddr::10.153.80.215/24/eth0 drbddisk::r0 Filesystem::/dev/drbd0::/nfs4exports::ext3 nfslock nfs
>
>
> [root at PdCuLx0501p ~]# cat /etc/ha.d/authkeys
> auth 3
> 3 md5 50f4a0bd87aedb051e93b6aa16f1433e
>
>
> What is the problem in my configuration? Why am I not able to mount
> the filesystem from the client once the secondary has acquired primary
> status, and why does the primary restart when I stop the heartbeat
> service?

Have you tried stopping NFS and unmounting by hand?

I have seen recent NFS bugs where the filesystem can't be unmounted after
you stop NFS (NFS doesn't release all its references to the filesystem).

With Pacemaker and STONITH configured, a failed unmount like that will
cause the system to reboot.  But for old-style haresources, I don't know
what would cause that.

But I would do this test by hand and see what result you get.
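
A minimal sketch of that manual test, assuming the resources are released
in the reverse of the order listed in haresources (nfs, nfslock, then the
filesystem, then DRBD). The service names, mount point, and resource name
come from the configuration quoted above; the DRY_RUN switch and the
function names are hypothetical helpers for this sketch. By default it only
prints the commands; set DRY_RUN=0 on the primary to actually run them.

```shell
#!/bin/sh
# Release the NFS resources by hand, one step at a time, so the failing
# step is visible. DRY_RUN=1 (the default here) only prints the commands.
DRY_RUN="${DRY_RUN:-1}"

run() {
    # Show the command; execute it only when DRY_RUN=0.
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

release_by_hand() {
    run service nfs stop || return 1
    run service nfslock stop || return 1
    if ! run umount /nfs4exports; then
        # fuser lists user-space processes using the mount; references held
        # inside the kernel NFS server will not show up here.
        echo "umount failed; processes using the mount (if any):" >&2
        run fuser -vm /nfs4exports
        return 1
    fi
    run drbdadm secondary r0
}

release_by_hand
```

If the umount step is the one that fails, that would point at the NFS
reference problem described above rather than at DRBD or heartbeat itself.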

-- 
     Alan Robertson <alanr at unix.sh>

"Openness is the foundation and preservative of friendship...  Let me claim from you at all times your undisguised opinions." - William Wilberforce
