OK, I am going mad now..... I shut down the primary and the secondary does nothing.... Here are the logs, what's up with that? It was working before. So far this whole heartbeat & drbd thing is far from stable; it is unpredictable whether it will work or not. Any help appreciated.

I killed the primary at 16:00.

Jan 18 16:00:37 megs heartbeat: [6093]: WARN: node stewie: is dead
Jan 18 16:00:37 megs heartbeat: [6093]: WARN: No STONITH device configured.
Jan 18 16:00:37 megs heartbeat: [6093]: WARN: Shared disks are not protected.
Jan 18 16:00:37 megs heartbeat: [6093]: info: Resources being acquired from stewie.
Jan 18 16:00:37 megs heartbeat: [6093]: info: Link stewie:/dev/ttyS0 dead.
Jan 18 16:00:37 megs heartbeat: [6600]: debug: notify_world: setting SIGCHLD Handler to SIG_DFL
Jan 18 16:00:37 megs heartbeat: [6601]: info: No local resources [/usr/lib/heartbeat/ResourceManager listkeys megs] to acquire.
Jan 18 16:00:37 megs heartbeat: [6093]: debug: StartNextRemoteRscReq(): child count 1
Jan 18 16:00:37 megs IPaddr[6817]: [6867]: INFO: /usr/lib/heartbeat/send_arp -i 500 -r 10 -p /var/run/heartbeat/rsctmp/send_arp/send_arp-192.168.5.199 eth1 192.168.5.199 auto 192.168.5.199 ffffffffffff
Jan 18 16:00:38 megs mach_down[6616]: [7623]: info: /usr/lib/heartbeat/mach_down: nice_failback: foreign resources acquired
Jan 18 16:00:38 megs heartbeat: [6093]: info: mach_down takeover complete.
Jan 18 16:01:08 megs heartbeat: [6093]: info: megs wants to go standby [foreign]
Jan 18 16:01:19 megs heartbeat: [6093]: WARN: No reply to standby request. Standby request cancelled.
Jan 18 16:10:03 megs heartbeat: [6093]: info: Heartbeat shutdown in progress. (6093)
Jan 18 16:10:03 megs heartbeat: [7848]: info: Giving up all HA resources.
Jan 18 16:10:03 megs heartbeat: [7848]: info: All HA resources relinquished.
Jan 18 16:10:04 megs heartbeat: [6093]: info: killing /usr/lib/heartbeat/ipfail process group 6109 with signal 15
Jan 18 16:10:06 megs heartbeat: [6093]: info: killing HBFIFO process 6096 with signal 15
Jan 18 16:10:06 megs heartbeat: [6093]: info: killing HBWRITE process 6097 with signal 15
Jan 18 16:10:06 megs heartbeat: [6093]: info: killing HBREAD process 6098 with signal 15
Jan 18 16:10:06 megs heartbeat: [6093]: info: killing HBWRITE process 6099 with signal 15
Jan 18 16:10:06 megs heartbeat: [6093]: info: killing HBREAD process 6100 with signal 15
Jan 18 16:10:06 megs heartbeat: [6093]: info: Core process 6098 exited. 5 remaining
Jan 18 16:10:06 megs heartbeat: [6093]: info: Core process 6099 exited. 4 remaining
Jan 18 16:10:06 megs heartbeat: [6093]: info: Core process 6100 exited. 3 remaining
Jan 18 16:10:06 megs heartbeat: [6093]: info: Core process 6096 exited. 2 remaining
Jan 18 16:10:06 megs heartbeat: [6093]: info: Core process 6097 exited. 1 remaining
Jan 18 16:10:06 megs heartbeat: [6093]: info: megs Heartbeat shutdown complete.
Jan 18 16:11:16 megs heartbeat: [8313]: info: Enabling logging daemon
Jan 18 16:11:16 megs heartbeat: [8313]: info: logfile and debug file are those specified in logd config file (default /etc/logd.cf)
Jan 18 16:11:16 megs heartbeat: [8313]: WARN: Core dumps could be lost if multiple dumps occur
Jan 18 16:11:16 megs heartbeat: [8313]: WARN: Consider setting /proc/sys/kernel/core_uses_pid (or equivalent) to 1 for maximum supportability
Jan 18 16:11:16 megs heartbeat: [8313]: WARN: logd is enabled but logfile/debugfile/logfacility is still configured in ha.cf
Jan 18 16:11:16 megs heartbeat: [8313]: info: **************************
Jan 18 16:11:16 megs heartbeat: [8313]: info: Configuration validated. Starting heartbeat 2.0.7
Jan 18 16:11:16 megs heartbeat: [8314]: info: heartbeat: version 2.0.7
Jan 18 16:11:16 megs heartbeat: [8314]: info: Heartbeat generation: 15
Jan 18 16:11:16 megs heartbeat: [8314]: info: G_main_add_TriggerHandler: Added signal manual handler
Jan 18 16:11:16 megs heartbeat: [8314]: info: G_main_add_TriggerHandler: Added signal manual handler
Jan 18 16:11:16 megs heartbeat: [8314]: info: Removing /var/run/heartbeat/rsctmp failed, recreating.
Jan 18 16:11:16 megs heartbeat: [8314]: info: glib: ping heartbeat started.
Jan 18 16:11:16 megs heartbeat: [8314]: info: glib: Starting serial heartbeat on tty /dev/ttyS0 (19200 baud)
Jan 18 16:11:16 megs heartbeat: [8314]: info: G_main_add_SignalHandler: Added signal handler for signal 17
Jan 18 16:11:16 megs heartbeat: [8314]: info: Local status now set to: 'up'
Jan 18 16:11:17 megs heartbeat: [8314]: info: Link 192.168.5.1:192.168.5.1 up.
Jan 18 16:11:17 megs heartbeat: [8314]: info: Status update for node 192.168.5.1: status ping
Jan 18 16:13:16 megs heartbeat: [8314]: WARN: node stewie: is dead
Jan 18 16:13:16 megs heartbeat: [8314]: info: Comm_now_up(): updating status to active
Jan 18 16:13:16 megs heartbeat: [8314]: info: Local status now set to: 'active'
Jan 18 16:13:16 megs heartbeat: [8314]: info: Starting child client "/usr/lib/heartbeat/ipfail" (108,110)
Jan 18 16:13:16 megs heartbeat: [8314]: WARN: No STONITH device configured.
Jan 18 16:13:16 megs heartbeat: [8314]: WARN: Shared disks are not protected.
Jan 18 16:13:16 megs heartbeat: [8314]: info: Resources being acquired from stewie.
Jan 18 16:13:16 megs heartbeat: [8325]: info: Starting "/usr/lib/heartbeat/ipfail" as uid 108 gid 110 (pid 8325)
Jan 18 16:13:16 megs heartbeat: [8326]: debug: notify_world: setting SIGCHLD Handler to SIG_DFL
Jan 18 16:13:16 megs ipfail: [8325]: debug: Signing in with heartbeat
Jan 18 16:13:16 megs heartbeat: [8327]: info: No local resources [/usr/lib/heartbeat/ResourceManager listkeys megs] to acquire.
Jan 18 16:13:16 megs heartbeat: [8314]: info: Initial resource acquisition complete (T_RESOURCES(us))
Jan 18 16:13:16 megs heartbeat: [8314]: debug: StartNextRemoteRscReq(): child count 1
Jan 18 16:13:17 megs IPaddr[8543]: [8593]: INFO: /usr/lib/heartbeat/send_arp -i 500 -r 10 -p /var/run/heartbeat/rsctmp/send_arp/send_arp-192.168.5.199 eth1 192.168.5.199 auto 192.168.5.199 ffffffffffff
Jan 18 16:13:18 megs mach_down[8342]: [9349]: info: /usr/lib/heartbeat/mach_down: nice_failback: foreign resources acquired
Jan 18 16:13:18 megs heartbeat: [8314]: info: mach_down takeover complete.
Jan 18 16:13:26 megs heartbeat: [8314]: info: Local Resource acquisition completed. (none)
Jan 18 16:13:26 megs heartbeat: [8314]: info: local resource transition completed.
Jan 18 16:13:48 megs heartbeat: [8314]: info: megs wants to go standby [foreign]
Jan 18 16:13:58 megs heartbeat: [8314]: WARN: No reply to standby request. Standby request cancelled.

Rob Morin
Dido Internet Inc.
Montreal,Canada
http://www.dido.ca
514-990-4444

Rob Morin wrote:
> OK so I just read on a post that I am not supposed to have heartbeat
> and my services start on boot up..... so I removed them from startup
> via the "update-rc.d -f heartbeat remove" command on my Debian
> system and then rebooted, everything came up????
>
> Hhmmm can I not add heartbeat as an S99heartbeat on start up rather
> than have to manually start it after boot?
>
> Also I see this in my /proc/drbd on the primary
>
> version: 8.0.4 (api:86/proto:86)
> SVN Revision: 2947 build by root at stewie, 2008-01-17 12:26:19
>  0: cs:StandAlone st:Primary/Unknown ds:UpToDate/DUnknown r---
>     ns:0 nr:0 dw:40 dr:1549 al:0 bm:0 lo:0 pe:0 ua:0 ap:0
>         resync: used:0/31 hits:0 misses:0 starving:0 dirty:0 changed:0
>         act_log: used:0/257 hits:10 misses:0 starving:0 dirty:0 changed:0
>
> And on secondary...
>
> version: 8.0.4 (api:86/proto:86)
> SVN Revision: 2947 build by root at megs, 2008-01-17 13:19:05
>  0: cs:StandAlone st:Secondary/Unknown ds:UpToDate/DUnknown r---
>     ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0
>         resync: used:0/31 hits:0 misses:0 starving:0 dirty:0 changed:0
>         act_log: used:0/257 hits:0 misses:0 starving:0 dirty:0 changed:0
>
> Does this mean drbd is not working correctly?
>
> Rob Morin
> Dido Internet Inc.
> Montreal,Canada
> http://www.dido.ca
> 514-990-4444
>
> Rob Morin wrote:
>> Hello all, my first post here so be gentle....
>>
>> I installed DRBD 8 via Debian package on Debian Etch...
>>
>> I pretty much got everything working, except.....
>>
>> When I reboot the primary nothing comes up.... I get this in the
>> heartbeat log.... The secondary is up all the time and does not have
>> control of anything...
>> Please let me know if more info or log file entries are needed....
>>
>> conf and log files are below.... I removed comments to shorten the post
>> ---------------------------------------------------------------------
>> Heartbeat log file.....
>>
>> /dev/drbd0: Wrong medium type
>> INFO: Filesystem Success
>> INFO: IPaddr Success
>>
>> drbd.conf file......
>>
>> global {
>>     usage-count yes;
>> }
>>
>> common {
>>     syncer { rate 10M; }
>> }
>>
>> resource web {
>>
>>     protocol C;
>>
>>     handlers {
>>         pri-on-incon-degr "echo o > /proc/sysrq-trigger ; halt -f";
>>         pri-lost-after-sb "echo o > /proc/sysrq-trigger ; halt -f";
>>         local-io-error "echo o > /proc/sysrq-trigger ; halt -f";
>>         outdate-peer "/usr/sbin/drbd-peer-outdater";
>>     }
>>
>>     startup {
>>         wfc-timeout 60;
>>         degr-wfc-timeout 120; # 2 minutes.
>>     }
>>
>>     disk {
>>         on-io-error detach;
>>     }
>>
>>     net {
>>         after-sb-0pri disconnect;
>>         after-sb-1pri disconnect;
>>         after-sb-2pri disconnect;
>>         rr-conflict disconnect;
>>     }
>>
>>     syncer {
>>         rate 100M;
>>     }
>>
>>     on stewie {
>>         device /dev/drbd0;
>>         disk /dev/md2;
>>         address 192.168.5.149:7788;
>>         flexible-meta-disk internal;
>>     }
>>
>>     on megs {
>>         device /dev/drbd0;
>>         disk /dev/md2;
>>         address 192.168.5.151:7788;
>>         meta-disk internal;
>>     }
>> }
>>
>> ha.cf conf file......
>>
>> stewie:/etc# cat ha.d/ha.cf
>> logfacility daemon # This is deprecated
>> keepalive 1 # Interval between heartbeat (HB) packets.
>> deadtime 10 # How quickly HB determines a dead node.
>> warntime 5 # Time HB will issue a late HB.
>> initdead 120 # Time delay needed by HB to report a dead node.
>> udpport 694 # UDP port HB uses to communicate between nodes.
>> ping 192.168.5.1 # Ping VMware Server host to simulate network resource.
>> serial /dev/ttyS0 # Which interface to use for HB packets.
>> auto_failback on # Auto promotion of primary node upon return to cluster.
>> node stewie # Node name must be same as uname -n.
>> node megs # Node name must be same as uname -n.
>>
>> respawn hacluster /usr/lib/heartbeat/ipfail
>> # Specifies which programs to run at startup
>>
>> use_logd yes # Use system logging.
>> logfile /var/log/hb.log # Heartbeat logfile.
>> debugfile /var/log/heartbeat-debug.log # Debugging logfile.
>>
>> haresources file.....
>>
>> stewie IPaddr::192.168.5.199 drbddisk::web \
>> Filesystem::/dev/drbd0::/var/www::ext3::defaults apache2
>> ---------------------------------------------------------------------------------------------------
>>
>> Also, which is supposed to start first, drbd or heartbeat?
>> drbd starts first, as per rc2.d
>>
>> lrwxrwxrwx 1 root root 14 2008-01-16 14:41 S70drbd -> ../init.d/drbd
>> lrwxrwxrwx 1 root root 19 2008-01-16 15:14 S75heartbeat -> ../init.d/heartbeat
>>
>> A ps -ax shows this after the reboot, but nothing comes up: the IP is
>> not enabled and my /var/www is not mounted
>>
>> 2957 ? S 0:00 [drbd0_worker]
>> 2972 ? S 0:00 [drbd0_receiver]
>> 2987 ? S 0:00 ha_logd: read process
>> 2988 ? S 0:00 ha_logd: write process
>> 3112 ? SLs 0:00 heartbeat: master control process
>> 3119 ? Ss 0:00 /usr/sbin/atd
>> 3126 ? Ss 0:00 /usr/sbin/cron
>> 3139 ? SL 0:00 heartbeat: FIFO reader
>> 3140 ? SL 0:00 heartbeat: write: ping 192.168.5.1
>> 3141 ? SL 0:00 heartbeat: read: ping 192.168.5.1
>> 3148 ? SL 0:00 heartbeat: write: serial /dev/ttyS0
>> 3149 ? SL 0:00 heartbeat: read: serial /dev/ttyS0
>>
>> Thanks to all for your help.....
>>
> _______________________________________________
> drbd-user mailing list
> drbd-user at lists.linbit.com
> http://lists.linbit.com/mailman/listinfo/drbd-user
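
On the "/proc/drbd" question quoted above: both nodes report cs:StandAlone with the peer role Unknown, which means the two DRBD devices are not connected to each other. For anyone who wants to check that from a script, here is a rough sketch (not anything shipped with DRBD or heartbeat) that pulls the cs/st/ds fields out of the status line. The function name parse_drbd_status, the regular expression and the "standalone" key are made up for illustration, and the field layout is assumed to match the 8.0.4 output quoted in this thread.

```python
import re

# Matches the per-device status line of a DRBD 8.0-style /proc/drbd, e.g.
#  0: cs:StandAlone st:Primary/Unknown ds:UpToDate/DUnknown r---
# (layout assumed from the output quoted in this thread)
STATUS_RE = re.compile(
    r"^\s*(?P<minor>\d+):\s+cs:(?P<cs>\S+)\s+st:(?P<st>\S+)\s+ds:(?P<ds>\S+)"
)


def parse_drbd_status(text):
    """Return one dict per device status line found in /proc/drbd-style text."""
    devices = []
    for line in text.splitlines():
        match = STATUS_RE.match(line)
        if match is None:
            continue  # version banner, counters, resync/act_log lines
        local_role, _, peer_role = match.group("st").partition("/")
        local_disk, _, peer_disk = match.group("ds").partition("/")
        devices.append({
            "minor": int(match.group("minor")),
            "connection": match.group("cs"),
            "local_role": local_role,
            "peer_role": peer_role,
            "local_disk": local_disk,
            "peer_disk": peer_disk,
            # StandAlone: this node has no connection to its peer.
            "standalone": match.group("cs") == "StandAlone",
        })
    return devices


# Sample taken from the primary's output quoted in this thread.
SAMPLE = """\
version: 8.0.4 (api:86/proto:86)
 0: cs:StandAlone st:Primary/Unknown ds:UpToDate/DUnknown r---
    ns:0 nr:0 dw:40 dr:1549 al:0 bm:0 lo:0 pe:0 ua:0 ap:0
"""

if __name__ == "__main__":
    for dev in parse_drbd_status(SAMPLE):
        if dev["standalone"]:
            print(f"drbd{dev['minor']}: StandAlone, peer is {dev['peer_role']}")
```

On a real node you would pass the contents of /proc/drbd instead of SAMPLE. The sketch only reports the state; it does not try to reconnect anything.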