Hello,

I am investigating why our server pairs reboot themselves from time to time. This is very annoying because these machines are in production, and I always have to fix MySQL replication or DRBD split-brains after these reboots.

We have 3 pairs that use a drbd/xen/heartbeat setup, and 2 of these pairs crash, sometimes every two weeks, sometimes only twice a year. I first thought it could be heartbeat, but I stopped the service on one pair and we still had a crash.

Are there other people who have had this kind of crash? I don't even know if it is a crash: I can never find anything in my logfiles about problems, or about heartbeat doing a safety reboot.

This is one drbd.conf entry:

    resource drbd_backend {
      protocol C;
      startup {
        degr-wfc-timeout 120;    # 2 minutes.
      }
      disk {
        on-io-error detach;
      }
      net {
      }
      syncer {
        rate 500M;
        al-extents 257;
      }
      on xen-B1.fra1 {
        device    /dev/drbd0;
        disk      /dev/md3;
        address   172.20.2.1:7788;
        meta-disk internal;
      }
      on xen-A1.fra1 {
        device    /dev/drbd0;
        disk      /dev/md3;
        address   172.20.1.1:7788;
        meta-disk internal;
      }
    }

This is the ha.cf:

    debugfile /var/log/ha-debug
    logfile /var/log/ha-log
    logfacility local0
    keepalive 2
    deadtime 60
    #warntime 10
    initdead 120
    udpport 694
    ucast eth0 172.20.1.1
    ucast eth0 172.20.2.1
    auto_failback on
    node xen-A1.fra1
    node xen-B1.fra1

And this is the xen config:

    debugfile /var/log/ha-debug
    logfile /var/log/ha-log
    logfacility local0
    keepalive 2
    deadtime 60
    #warntime 10
    initdead 120
    udpport 694
    ucast eth0 172.20.1.1
    ucast eth0 172.20.2.1
    auto_failback on
    node xen-A1.fra1
    node xen-B1.fra1

Can you please give me some assistance?

Greetings,
Rupert
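
A minimal sketch, assuming the ha.cf contents quoted in the message, of how the heartbeat timing directives could be read out so that the two nodes of a pair can be compared. The helper name `parse_ha_cf` is hypothetical and not part of heartbeat; the only syntax it relies on is "directive, space, value" lines with `#` comments, as seen in the quoted file.

```python
from collections import defaultdict

# ha.cf contents as quoted in the message
HA_CF = """\
debugfile /var/log/ha-debug
logfile /var/log/ha-log
logfacility local0
keepalive 2
deadtime 60
#warntime 10
initdead 120
udpport 694
ucast eth0 172.20.1.1
ucast eth0 172.20.2.1
auto_failback on
node xen-A1.fra1
node xen-B1.fra1
"""


def parse_ha_cf(text):
    """Return {directive: [values, ...]} (hypothetical helper).

    Directives such as ucast and node may repeat, so every value is
    collected in a list. Text after '#' is treated as a comment.
    """
    directives = defaultdict(list)
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line:
            continue  # blank or commented-out line
        name, _, value = line.partition(" ")
        directives[name].append(value.strip())
    return dict(directives)


config = parse_ha_cf(HA_CF)
keepalive = int(config["keepalive"][0])
deadtime = int(config["deadtime"][0])
initdead = int(config["initdead"][0])
print("keepalive:", keepalive, "deadtime:", deadtime, "initdead:", initdead)
print("nodes:", config["node"])
```

Running the same function over the ha.cf from each node and comparing the resulting dictionaries would show whether the two sides of a pair disagree on any timing value.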