Hello,
I am investigating why our server pairs reboot themselves from time to time.
This is very annoying because these machines are in production, and I always
have to fix MySQL replication or DRBD split-brains after these reboots.
We have 3 pairs that use a DRBD/Xen/heartbeat setup, and 2 of these pairs
crash, sometimes every 2 weeks, sometimes only twice a year.
I first thought it could be heartbeat, but I stopped the service on 1 pair
and we still had a crash.
Has anyone else had this kind of crash?
I don't even know if it is a crash: I can never find anything in my logfiles
about problems, or about heartbeat doing a safety reboot.
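When the logfiles show nothing, one way to tell an orderly reboot from a crash or hard reset is to look at the wtmp records printed by `last -x`: an orderly reboot leaves a `shutdown` record just before the next `reboot` record, while a crash leaves two `reboot` records with nothing in between. The following is only a minimal sketch under that assumption; the function name `find_unclean_boots` and the sample lines (including the kernel string) are made up for illustration.

```python
def find_unclean_boots(last_lines):
    """Return the 'reboot' lines that have no 'shutdown' record before them.

    last_lines: output lines of `last -x`, newest entry first (the default).
    """
    unclean = []
    pending = None  # newest 'reboot' line still waiting for an older 'shutdown'
    for line in last_lines:
        fields = line.split()
        if not fields:
            continue
        kind = fields[0]
        if kind == "reboot":
            if pending is not None:
                # Two boots with no shutdown between them: the later boot
                # followed a crash, power loss or hard reset.
                unclean.append(pending)
            pending = line
        elif kind == "shutdown":
            # The boot we were tracking was preceded by an orderly shutdown.
            pending = None
    # The oldest boot in the log is not reported: wtmp may have been rotated.
    return unclean


# Hypothetical sample in the shape of `last -x` output, newest first.
SAMPLE = [
    "reboot   system boot  2.6.18-xen  Wed Jun  3 10:12",
    "reboot   system boot  2.6.18-xen  Mon May 18 03:40",
    "shutdown system down  2.6.18-xen  Mon May 18 03:38",
    "reboot   system boot  2.6.18-xen  Fri May  1 09:00",
]

for entry in find_unclean_boots(SAMPLE):
    print("boot without prior shutdown:", entry)
```

On a real machine the input could come from running `last -x` and splitting its output into lines. If heartbeat or anything else triggered an orderly reboot, a `shutdown` record should be present; if it is missing, the machine most likely went down hard.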
This is one drbd.conf entry:
resource drbd_backend {
  protocol C;
  startup {
    degr-wfc-timeout 120; # 2 minutes.
  }
  disk {
    on-io-error detach;
  }
  net {
  }
  syncer {
    rate 500M;
    al-extents 257;
  }
  on xen-B1.fra1 {
    device /dev/drbd0;
    disk /dev/md3;
    address 172.20.2.1:7788;
    meta-disk internal;
  }
  on xen-A1.fra1 {
    device /dev/drbd0;
    disk /dev/md3;
    address 172.20.1.1:7788;
    meta-disk internal;
  }
}
This is the ha.cf:
debugfile /var/log/ha-debug
logfile /var/log/ha-log
logfacility local0
keepalive 2
deadtime 60
#warntime 10
initdead 120
udpport 694
ucast eth0 172.20.1.1
ucast eth0 172.20.2.1
auto_failback on
node xen-A1.fra1
node xen-B1.fra1
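The timing directives in an ha.cf like the one above can be sanity-checked against each other. The sketch below is only an illustration: the helper names `parse_ha_timings` and `check_ha_timings` are invented here, and the rules it applies (keepalive and warntime below deadtime, initdead at least twice deadtime) are the usual rules of thumb as I understand the heartbeat documentation, not an official validator.

```python
TIMING_KEYS = ("keepalive", "warntime", "deadtime", "initdead")


def to_seconds(value):
    """Convert an ha.cf time value to seconds ('2' -> 2.0, '500ms' -> 0.5)."""
    if value.endswith("ms"):
        return float(value[:-2]) / 1000.0
    return float(value)


def parse_ha_timings(text):
    """Collect the timing directives from ha.cf-style text, skipping comments."""
    timings = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments such as '#warntime 10'
        parts = line.split()
        if len(parts) == 2 and parts[0] in TIMING_KEYS:
            timings[parts[0]] = to_seconds(parts[1])
    return timings


def check_ha_timings(t):
    """Return a list of human-readable problems; empty if none were found."""
    problems = []
    if "deadtime" not in t:
        return ["deadtime is not set"]
    if "keepalive" in t and t["keepalive"] >= t["deadtime"]:
        problems.append("keepalive should be smaller than deadtime")
    if "warntime" in t and t["warntime"] >= t["deadtime"]:
        problems.append("warntime should be smaller than deadtime")
    if "initdead" in t and t["initdead"] < 2 * t["deadtime"]:
        problems.append("initdead should be at least twice deadtime")
    return problems


# The timing lines from the ha.cf posted above.
POSTED = """
keepalive 2
deadtime 60
#warntime 10
initdead 120
"""

print(check_ha_timings(parse_ha_timings(POSTED)))
```

A check like this only looks at the relationships between the values; it says nothing about whether a 60-second deadtime suits a particular network.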
And this is the Xen config:
debugfile /var/log/ha-debug
logfile /var/log/ha-log
logfacility local0
keepalive 2
deadtime 60
#warntime 10
initdead 120
udpport 694
ucast eth0 172.20.1.1
ucast eth0 172.20.2.1
auto_failback on
node xen-A1.fra1
node xen-B1.fra1
Can you please give me some assistance?
Greetings,
Rupert