OK, I am going mad now. I shut down the primary and the secondary
does nothing. Here are the logs; what's up with that?
It was working before. So far this whole heartbeat / drbd thing is far
from stable; it is unpredictable whether it will work or not.
Any help appreciated.
I killed the primary at 16:00
Jan 18 16:00:37 megs heartbeat: [6093]: WARN: node stewie: is dead
Jan 18 16:00:37 megs heartbeat: [6093]: WARN: No STONITH device configured.
Jan 18 16:00:37 megs heartbeat: [6093]: WARN: Shared disks are not
protected.
Jan 18 16:00:37 megs heartbeat: [6093]: info: Resources being acquired
from stewie.
Jan 18 16:00:37 megs heartbeat: [6093]: info: Link stewie:/dev/ttyS0 dead.
Jan 18 16:00:37 megs heartbeat: [6600]: debug: notify_world: setting
SIGCHLD Handler to SIG_DFL
Jan 18 16:00:37 megs heartbeat: [6601]: info: No local resources
[/usr/lib/heartbeat/ResourceManager listkeys megs] to acquire.
Jan 18 16:00:37 megs heartbeat: [6093]: debug: StartNextRemoteRscReq():
child count 1
Jan 18 16:00:37 megs IPaddr[6817]: [6867]: INFO:
/usr/lib/heartbeat/send_arp -i 500 -r 10 -p
/var/run/heartbeat/rsctmp/send_arp/send_arp-192.168.5.199 eth1
192.168.5.199 auto 192.168.5.199 ffffffffffff
Jan 18 16:00:38 megs mach_down[6616]: [7623]: info:
/usr/lib/heartbeat/mach_down: nice_failback: foreign resources acquired
Jan 18 16:00:38 megs heartbeat: [6093]: info: mach_down takeover complete.
Jan 18 16:01:08 megs heartbeat: [6093]: info: megs wants to go standby
[foreign]
Jan 18 16:01:19 megs heartbeat: [6093]: WARN: No reply to standby
request. Standby request cancelled.
Jan 18 16:10:03 megs heartbeat: [6093]: info: Heartbeat shutdown in
progress. (6093)
Jan 18 16:10:03 megs heartbeat: [7848]: info: Giving up all HA resources.
Jan 18 16:10:03 megs heartbeat: [7848]: info: All HA resources relinquished.
Jan 18 16:10:04 megs heartbeat: [6093]: info: killing
/usr/lib/heartbeat/ipfail process group 6109 with signal 15
Jan 18 16:10:06 megs heartbeat: [6093]: info: killing HBFIFO process
6096 with signal 15
Jan 18 16:10:06 megs heartbeat: [6093]: info: killing HBWRITE process
6097 with signal 15
Jan 18 16:10:06 megs heartbeat: [6093]: info: killing HBREAD process
6098 with signal 15
Jan 18 16:10:06 megs heartbeat: [6093]: info: killing HBWRITE process
6099 with signal 15
Jan 18 16:10:06 megs heartbeat: [6093]: info: killing HBREAD process
6100 with signal 15
Jan 18 16:10:06 megs heartbeat: [6093]: info: Core process 6098 exited.
5 remaining
Jan 18 16:10:06 megs heartbeat: [6093]: info: Core process 6099 exited.
4 remaining
Jan 18 16:10:06 megs heartbeat: [6093]: info: Core process 6100 exited.
3 remaining
Jan 18 16:10:06 megs heartbeat: [6093]: info: Core process 6096 exited.
2 remaining
Jan 18 16:10:06 megs heartbeat: [6093]: info: Core process 6097 exited.
1 remaining
Jan 18 16:10:06 megs heartbeat: [6093]: info: megs Heartbeat shutdown
complete.
Jan 18 16:11:16 megs heartbeat: [8313]: info: Enabling logging daemon
Jan 18 16:11:16 megs heartbeat: [8313]: info: logfile and debug file are
those specified in logd config file (default /etc/logd.cf)
Jan 18 16:11:16 megs heartbeat: [8313]: WARN: Core dumps could be lost
if multiple dumps occur
Jan 18 16:11:16 megs heartbeat: [8313]: WARN: Consider setting
/proc/sys/kernel/core_uses_pid (or equivalent) to 1 for maximum
supportability
Jan 18 16:11:16 megs heartbeat: [8313]: WARN: logd is enabled but
logfile/debugfile/logfacility is still configured in ha.cf
Jan 18 16:11:16 megs heartbeat: [8313]: info: **************************
Jan 18 16:11:16 megs heartbeat: [8313]: info: Configuration validated.
Starting heartbeat 2.0.7
Jan 18 16:11:16 megs heartbeat: [8314]: info: heartbeat: version 2.0.7
Jan 18 16:11:16 megs heartbeat: [8314]: info: Heartbeat generation: 15
Jan 18 16:11:16 megs heartbeat: [8314]: info: G_main_add_TriggerHandler:
Added signal manual handler
Jan 18 16:11:16 megs heartbeat: [8314]: info: G_main_add_TriggerHandler:
Added signal manual handler
Jan 18 16:11:16 megs heartbeat: [8314]: info: Removing
/var/run/heartbeat/rsctmp failed, recreating.
Jan 18 16:11:16 megs heartbeat: [8314]: info: glib: ping heartbeat started.
Jan 18 16:11:16 megs heartbeat: [8314]: info: glib: Starting serial
heartbeat on tty /dev/ttyS0 (19200 baud)
Jan 18 16:11:16 megs heartbeat: [8314]: info: G_main_add_SignalHandler:
Added signal handler for signal 17
Jan 18 16:11:16 megs heartbeat: [8314]: info: Local status now set to: 'up'
Jan 18 16:11:17 megs heartbeat: [8314]: info: Link
192.168.5.1:192.168.5.1 up.
Jan 18 16:11:17 megs heartbeat: [8314]: info: Status update for node
192.168.5.1: status ping
Jan 18 16:13:16 megs heartbeat: [8314]: WARN: node stewie: is dead
Jan 18 16:13:16 megs heartbeat: [8314]: info: Comm_now_up(): updating
status to active
Jan 18 16:13:16 megs heartbeat: [8314]: info: Local status now set to:
'active'
Jan 18 16:13:16 megs heartbeat: [8314]: info: Starting child client
"/usr/lib/heartbeat/ipfail" (108,110)
Jan 18 16:13:16 megs heartbeat: [8314]: WARN: No STONITH device configured.
Jan 18 16:13:16 megs heartbeat: [8314]: WARN: Shared disks are not
protected.
Jan 18 16:13:16 megs heartbeat: [8314]: info: Resources being acquired
from stewie.
Jan 18 16:13:16 megs heartbeat: [8325]: info: Starting
"/usr/lib/heartbeat/ipfail" as uid 108 gid 110 (pid 8325)
Jan 18 16:13:16 megs heartbeat: [8326]: debug: notify_world: setting
SIGCHLD Handler to SIG_DFL
Jan 18 16:13:16 megs ipfail: [8325]: debug: Signing in with heartbeat
Jan 18 16:13:16 megs heartbeat: [8327]: info: No local resources
[/usr/lib/heartbeat/ResourceManager listkeys megs] to acquire.
Jan 18 16:13:16 megs heartbeat: [8314]: info: Initial resource
acquisition complete (T_RESOURCES(us))
Jan 18 16:13:16 megs heartbeat: [8314]: debug: StartNextRemoteRscReq():
child count 1
Jan 18 16:13:17 megs IPaddr[8543]: [8593]: INFO:
/usr/lib/heartbeat/send_arp -i 500 -r 10 -p
/var/run/heartbeat/rsctmp/send_arp/send_arp-192.168.5.199 eth1
192.168.5.199 auto 192.168.5.199 ffffffffffff
Jan 18 16:13:18 megs mach_down[8342]: [9349]: info:
/usr/lib/heartbeat/mach_down: nice_failback: foreign resources acquired
Jan 18 16:13:18 megs heartbeat: [8314]: info: mach_down takeover complete.
Jan 18 16:13:26 megs heartbeat: [8314]: info: Local Resource acquisition
completed. (none)
Jan 18 16:13:26 megs heartbeat: [8314]: info: local resource transition
completed.
Jan 18 16:13:48 megs heartbeat: [8314]: info: megs wants to go standby
[foreign]
Jan 18 16:13:58 megs heartbeat: [8314]: WARN: No reply to standby
request. Standby request cancelled.
Rob Morin
Dido Internet Inc.
Montreal,Canada
http://www.dido.ca
514-990-4444
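
For anyone reading logs like the ones above: the mail client wrapped the long lines, which makes them awkward to search. What follows is a minimal Python sketch, not part of the original post, that folds wrapped lines back into their entry and lists the WARN entries. The function names (parse_heartbeat_log, warning_entries) and the regular expression are illustrative assumptions based only on the line layout visible in this message.

```python
import re

# Assumed layout, taken from the lines in this post, e.g.:
#   Jan 18 16:00:37 megs heartbeat: [6093]: WARN: node stewie: is dead
#   Jan 18 16:00:37 megs IPaddr[6817]: [6867]: INFO:
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+(?P<host>\S+)\s+"
    r"(?P<prog>[^:\[]+)(?:\[\d+\])?:\s+\[(?P<pid>\d+)\]:\s+"
    r"(?P<level>\w+):\s*(?P<msg>.*)$"
)


def parse_heartbeat_log(text):
    """Return one dict per log entry, folding wrapped lines into the previous entry."""
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = LINE_RE.match(line)
        if match:
            entries.append(match.groupdict())
        elif entries:
            # A line without a timestamp is treated as the wrapped tail
            # of the previous entry.
            entries[-1]["msg"] = (entries[-1]["msg"] + " " + line).strip()
    return entries


def warning_entries(entries):
    """Return (timestamp, message) pairs for WARN and ERROR entries."""
    return [
        (e["ts"], e["msg"])
        for e in entries
        if e["level"].upper() in ("WARN", "ERROR")
    ]
```

Run against the log above, this would surface the "node stewie: is dead", "No STONITH device configured." and "No reply to standby request." warnings in order, which is usually the quickest way to see where a takeover stalled.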
Rob Morin wrote:
> OK, so I just read in a post that I am not supposed to have heartbeat
> and my services start on boot, so I removed them from startup
> via the "update-rc.d -f heartbeat remove" command on my Debian
> system and then rebooted, and everything came up?
>
> Hmm, can I not add heartbeat as an S99heartbeat on startup rather
> than having to start it manually after boot?
>
> Also, I see this in my /proc/drbd on the primary:
>
> version: 8.0.4 (api:86/proto:86)
> SVN Revision: 2947 build by root at stewie, 2008-01-17 12:26:19
> 0: cs:StandAlone st:Primary/Unknown ds:UpToDate/DUnknown r---
> ns:0 nr:0 dw:40 dr:1549 al:0 bm:0 lo:0 pe:0 ua:0 ap:0
> resync: used:0/31 hits:0 misses:0 starving:0 dirty:0 changed:0
> act_log: used:0/257 hits:10 misses:0 starving:0 dirty:0 changed:0
>
> And on the secondary:
>
> version: 8.0.4 (api:86/proto:86)
> SVN Revision: 2947 build by root at megs, 2008-01-17 13:19:05
> 0: cs:StandAlone st:Secondary/Unknown ds:UpToDate/DUnknown r---
> ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0
> resync: used:0/31 hits:0 misses:0 starving:0 dirty:0 changed:0
> act_log: used:0/257 hits:0 misses:0 starving:0 dirty:0 changed:0
>
>
> Does this mean drbd is not working correctly?
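
On the question just above: in DRBD 8.0 the cs: field is the connection state, and cs:StandAlone means the resource is not connected to its peer and is not trying to connect, which fits the Unknown peer role and DUnknown peer disk state shown on both nodes. A minimal Python sketch, assuming the 8.0-style /proc/drbd layout quoted in this message, that pulls those fields out and flags a StandAlone device (parse_proc_drbd and is_standalone are hypothetical helper names, not DRBD tools):

```python
import re

# Matches the per-device status line of an 8.0-style /proc/drbd, e.g.:
#   0: cs:StandAlone st:Primary/Unknown ds:UpToDate/DUnknown   r---
STATUS_RE = re.compile(
    r"^\s*(?P<minor>\d+):\s+cs:(?P<cs>\S+)"
    r"\s+st:(?P<local_role>[^/\s]+)/(?P<peer_role>\S+)"
    r"\s+ds:(?P<local_disk>[^/\s]+)/(?P<peer_disk>\S+)"
)


def parse_proc_drbd(text):
    """Return one dict per DRBD device found in the given /proc/drbd text."""
    devices = []
    for line in text.splitlines():
        match = STATUS_RE.match(line)
        if match:
            devices.append(match.groupdict())
    return devices


def is_standalone(device):
    """True when the device is not connected and not trying to connect."""
    return device["cs"] == "StandAlone"


if __name__ == "__main__":
    # Assumes the script runs on a node with the drbd module loaded.
    with open("/proc/drbd") as handle:
        for dev in parse_proc_drbd(handle.read()):
            flag = "NOT replicating" if is_standalone(dev) else "ok or in transition"
            print(f"drbd{dev['minor']}: cs={dev['cs']} ({flag})")
```

If both nodes report StandAlone, as they do here, the data on the two sides is not being kept in sync regardless of what heartbeat does with the resources.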
>
> Rob Morin wrote:
>> Hello all, my first post here so be gentle....
>>
>> I installed DRBD 8 via the Debian package on Debian Etch.
>>
>> I pretty much got everything working, except:
>>
>> When I reboot the primary, nothing comes up, and I get this in the
>> heartbeat log. The secondary is up all the time and does not have
>> control of anything.
>> Please let me know if more info or log file entries are needed.
>>
>> Conf and log files are below. I removed comments to shorten the post.
>> ---------------------------------------------------------------------
>> Heartbeat log file.....
>>
>> /dev/drbd0: Wrong medium type
>> INFO: Filesystem Success
>> INFO: IPaddr Success
>>
>>
>> drbd.conf file......
>>
>>
>> global {
>>     usage-count yes;
>> }
>>
>> common {
>>     syncer { rate 10M; }
>> }
>>
>> resource web {
>>     protocol C;
>>
>>     handlers {
>>         pri-on-incon-degr "echo o > /proc/sysrq-trigger ; halt -f";
>>         pri-lost-after-sb "echo o > /proc/sysrq-trigger ; halt -f";
>>         local-io-error "echo o > /proc/sysrq-trigger ; halt -f";
>>         outdate-peer "/usr/sbin/drbd-peer-outdater";
>>     }
>>
>>     startup {
>>         wfc-timeout 60;
>>         degr-wfc-timeout 120; # 2 minutes.
>>     }
>>
>>     disk {
>>         on-io-error detach;
>>     }
>>
>>     net {
>>         after-sb-0pri disconnect;
>>         after-sb-1pri disconnect;
>>         after-sb-2pri disconnect;
>>         rr-conflict disconnect;
>>     }
>>
>>     syncer {
>>         rate 100M;
>>     }
>>
>>     on stewie {
>>         device /dev/drbd0;
>>         disk /dev/md2;
>>         address 192.168.5.149:7788;
>>         flexible-meta-disk internal;
>>     }
>>
>>     on megs {
>>         device /dev/drbd0;
>>         disk /dev/md2;
>>         address 192.168.5.151:7788;
>>         meta-disk internal;
>>     }
>> }
>>
>>
>>
>> ha.cf conf file......
>>
>> stewie:/etc# cat ha.d/ha.cf
>> logfacility daemon # This is deprecated
>> keepalive 1 # Interval between heartbeat (HB) packets.
>> deadtime 10 # How quickly HB determines a dead node.
>> warntime 5 # Time before HB issues a late-heartbeat warning.
>> initdead 120 # Time delay needed by HB to report a dead node.
>> udpport 694 # UDP port HB uses to communicate between nodes.
>> ping 192.168.5.1 # Ping VMware Server host to simulate network resource.
>> serial /dev/ttyS0 # Which interface to use for HB packets.
>> auto_failback on # Auto promotion of primary node upon return to cluster.
>> node stewie # Node name must be same as uname -n.
>> node megs # Node name must be same as uname -n.
>>
>> respawn hacluster /usr/lib/heartbeat/ipfail
>> # Specifies which programs to run at startup
>>
>> use_logd yes # Use system logging.
>> logfile /var/log/hb.log # Heartbeat logfile.
>> debugfile /var/log/heartbeat-debug.log # Debugging logfile.
>>
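
Two things in an ha.cf like the one above can be checked mechanically: that the local host name (the output of uname -n) appears in a node line, and that use_logd is not combined with logfile/debugfile/logfacility, which is what the "logd is enabled but logfile/debugfile/logfacility is still configured in ha.cf" warning in the heartbeat log refers to. A minimal Python sketch, not a heartbeat tool; parse_ha_cf and check_ha_cf are hypothetical names, and treating only "yes" and "on" as enabling use_logd is an assumption:

```python
import os


def parse_ha_cf(text):
    """Return (directive, [arguments]) pairs, ignoring comments and blank lines."""
    directives = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        parts = line.split()
        directives.append((parts[0], parts[1:]))
    return directives


def check_ha_cf(text, local_name=None):
    """Return a list of human-readable problems found in the ha.cf text."""
    if local_name is None:
        # os.uname().nodename is the same value that `uname -n` prints.
        local_name = os.uname().nodename
    directives = parse_ha_cf(text)
    problems = []

    nodes = [arg for name, args in directives if name == "node" for arg in args]
    if local_name not in nodes:
        problems.append(f"local host '{local_name}' is not listed in any node line")

    # Assumption: only "yes" and "on" are treated as enabling logd here.
    use_logd = any(
        name == "use_logd" and args and args[0].lower() in ("yes", "on")
        for name, args in directives
    )
    names = {name for name, _ in directives}
    clash = sorted(names & {"logfile", "debugfile", "logfacility"})
    if use_logd and clash:
        problems.append("use_logd is enabled but also set: " + ", ".join(clash))
    return problems
```

Calling check_ha_cf(open("/etc/ha.d/ha.cf").read()) on each node would report both kinds of mismatch; note that a comment wrapped onto its own line without a leading # would be read as a directive by this parser.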
>> haresources file.....
>>
>> stewie IPaddr::192.168.5.199 drbddisk::web \
>> Filesystem::/dev/drbd0::/var/www::ext3::defaults apache2
>> ---------------------------------------------------------------------------------------------------
>>
>>
>> Also, which is supposed to start first, drbd or heartbeat? drbd
>> starts first as per rc2.d:
>>
>> lrwxrwxrwx 1 root root 14 2008-01-16 14:41 S70drbd -> ../init.d/drbd
>> lrwxrwxrwx 1 root root 19 2008-01-16 15:14 S75heartbeat -> ../init.d/heartbeat
>>
>> A ps -ax shows this after the reboot, but nothing comes up: the IP is
>> not enabled and my /var/www is not mounted.
>>
>> 2957 ? S 0:00 [drbd0_worker]
>> 2972 ? S 0:00 [drbd0_receiver]
>> 2987 ? S 0:00 ha_logd: read process
>> 2988 ? S 0:00 ha_logd: write process
>> 3112 ? SLs 0:00 heartbeat: master control process
>> 3119 ? Ss 0:00 /usr/sbin/atd
>> 3126 ? Ss 0:00 /usr/sbin/cron
>> 3139 ? SL 0:00 heartbeat: FIFO reader
>> 3140 ? SL 0:00 heartbeat: write: ping 192.168.5.1
>> 3141 ? SL 0:00 heartbeat: read: ping 192.168.5.1
>> 3148 ? SL 0:00 heartbeat: write: serial /dev/ttyS0
>> 3149 ? SL 0:00 heartbeat: read: serial /dev/ttyS0
>>
>> Thanks to all for your help.....
>>
> _______________________________________________
> drbd-user mailing list
> drbd-user at lists.linbit.com
> http://lists.linbit.com/mailman/listinfo/drbd-user