You are missing a "drbddisk" entry in your haresources file. Without it, heartbeat never promotes the DRBD resource to primary before the Filesystem script tries to mount /dev/drbd0, so the mount fails. You didn't post your drbd.conf, but if your resource name for /dev/drbd0 is "abc", then your haresources file should look like the following:

ha1.example.net 192.168.29.184/28/eth0/192.168.29.255 drbddisk::abc Filesystem::/dev/drbd0::/data1::ext3

Michael Haverkamp

drbd-user-bounces at linbit.com wrote on 07/26/2006 04:10:21 PM:

> Hello All,
>
> I have figured out 90% of how to configure a 2 node drbd/heartbeat
> cluster. No matter what I do, I can't seem to get the primary node to
> recognize /dev/drbd0 as primary when I reboot both nodes in the cluster.
> As a result, heartbeat fails the resource when it attempts to run the
> "Filesystem" script. They both show up as secondary. Can someone provide
> some insight into a way to ensure that my primary node in the cluster will
> always mark the drbd0 device on that system as primary (provided there are
> no failures or errors)?
>
> Thanks,
>
> Darren
>
> Here is my system info:
>
> ha1 - primary node
> ha2 - secondary node
> OS - RHEL4
> DRBD - 0.7.20
> Heartbeat - 2.0.6
>
> I am trying to mount the /data1 directory to /dev/drbd0 using heartbeat.
> When I reboot the primary node, I get nothing:
>
> ha1# df -k
> Filesystem           1K-blocks      Used Available Use% Mounted on
> /dev/sda1             10317828    765640   9028072   8% /
> none                   1037372         0   1037372   0% /dev/shm
>
> I have to manually force the primary:
>
> ha1# drbdadm primary all
> ha1# /etc/init.d/heartbeat stop
> ha1# /etc/init.d/heartbeat start
> ha1# df -k
> # df -k
> Filesystem           1K-blocks      Used Available Use% Mounted on
> /dev/sda1             10317828    765640   9028072   8% /
> none                   1037372         0   1037372   0% /dev/shm
> /dev/drbd0            57914656    800952  54171804   2% /data1
>
> I have moved the drbd rc script to run ahead of heartbeat by naming it
> S40drbd and heartbeat S99heartbeat. I have also triple checked to make
> sure that all the drbd, ha, and init script are identical on both systems.
>
> I believe I interpreted the timeout parameters correctly in the
> /etc/drbd.conf. My understanding is that a positive number here will force
> the node to become primary:
>
> startup {
>     wfc-timeout 1;
>     # degr-wfc-timeout 120; # 2 minutes.
> }
>
> When I reboot the primary and secondary nodes (5 seconds apart from each
> other), I receive the following info in dmesg on the primary node, stating
> that both nodes are in secondary:
>
> ha1# dmesg | grep drbd
> drbd: initialised. Version: 0.7.20 (api:79/proto:74)
> drbd: SVN Revision: 2260 build by root at ha2.strongmail.net, 2006-07-21
> 16:12:22
> drbd: registered as block device major 147
> drbd0: resync bitmap: bits=14709516 words=459674
> drbd0: size = 56 GB (58838062 KB)
> drbd0: 0 KB marked out-of-sync by on disk bit-map.
> drbd0: Found 6 transactions (64 active extents) in activity log.
> drbd0: drbdsetup [3052]: cstate Unconfigured --> StandAlone
> drbd0: drbdsetup [3065]: cstate StandAlone --> Unconnected
> drbd0: drbd0_receiver [3066]: cstate Unconnected --> WFConnection
> drbd0: drbd0_receiver [3066]: cstate WFConnection --> WFReportParams
> drbd0: Handshake successful: DRBD Network Protocol version 74
> drbd0: Connection established.
> drbd0: I am(S): 1:00000002:00000001:00000013:00000001:00
> drbd0: Peer(S): 1:00000002:00000001:00000013:00000001:00
> drbd0: drbd0_receiver [3066]: cstate WFReportParams --> Connected
> drbd0: Secondary/Unknown --> Secondary/Secondary
>
> This is further confirmed by heartbeat:
>
> ha1# tail -100 /var/log/ha-log
> <snip>
>
> ResourceManager[3510]: 2006/07/26_06:58:10 info: Running
> /etc/ha.d/resource.d/Filesystem /dev/drbd0 /data1 ext3 start
> Filesystem[3997]: 2006/07/26_06:58:10 INFO: Running start for
> /dev/drbd0 on /data1
> Filesystem[3997]: 2006/07/26_06:58:10 ERROR: Couldn't mount
> filesystem /dev/drbd0 on /data1
> Filesystem[3933]: 2006/07/26_06:58:10 ERROR: Filesystem Generic error
> ResourceManager[3510]: 2006/07/26_06:58:10 ERROR: Return code 1 from
> /etc/ha.d/resource.d/Filesystem
> ResourceManager[3510]: 2006/07/26_06:58:10 CRIT: Giving up resources due
> to failure of Filesystem::/dev/drbd0::/data1::ext3
>
> My /etc/fstab file has the correct entry on both ha1 and ha2 nodes:
>
> /dev/drbd0 /data1 ext3 noauto 0 0
>
> Here is the /etc/ha.d/haresources file:
>
> ha1.example.net 192.168.29.184/28/eth0/192.168.29.255
> Filesystem::/dev/drbd0::/data1::ext3
>
>
> _______________________________________________
> drbd-user mailing list
> drbd-user at lists.linbit.com
> http://lists.linbit.com/mailman/listinfo/drbd-user
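
For what it's worth, the ordering point in the reply (drbddisk has to come before Filesystem on the haresources line) can be checked mechanically. Below is a rough Python sketch, not part of heartbeat or DRBD; the function name missing_drbddisk is made up for illustration, and "abc" is the assumed resource name from the reply. It only checks that some drbddisk entry comes before a Filesystem entry on a /dev/drbdN device. It does not read drbd.conf, so it cannot confirm that the resource name actually matches the device.

```python
import re


def missing_drbddisk(line):
    """Return DRBD devices that Filesystem would mount with no earlier drbddisk entry.

    Hypothetical helper for illustration only. `line` is one haresources
    line: a node name followed by whitespace-separated resources, each
    written as script::arg1::arg2...
    """
    fields = line.split()
    resources = fields[1:]  # fields[0] is the preferred node name
    seen_drbddisk = False
    problems = []
    for res in resources:
        name, *args = res.split("::")
        if name == "drbddisk":
            seen_drbddisk = True
        elif name == "Filesystem" and args and re.match(r"/dev/drbd\d+$", args[0]):
            # heartbeat starts resources left to right, so drbddisk
            # must already have appeared by this point.
            if not seen_drbddisk:
                problems.append(args[0])
    return problems


# The line as originally posted, and the line suggested in the reply
# (resource name "abc" is an assumption, since drbd.conf was not posted).
original = ("ha1.example.net 192.168.29.184/28/eth0/192.168.29.255 "
            "Filesystem::/dev/drbd0::/data1::ext3")
fixed = ("ha1.example.net 192.168.29.184/28/eth0/192.168.29.255 "
         "drbddisk::abc Filesystem::/dev/drbd0::/data1::ext3")

print(missing_drbddisk(original))
print(missing_drbddisk(fixed))
```

The first call reports /dev/drbd0 as unprotected and the second reports nothing, which matches the before and after haresources lines in this thread.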