On Fri, Feb 20, 2009 at 2:37 PM, Victor Hugo dos Santos
<listas.vhs at gmail.com> wrote:
> Hello,
>
> I have a problem with drbd-0.7.25 and drbd-0.8.2.6. My situation is:
>
> two Supermicro servers in company A, connected with a crossover cable
> and running CentOS 5.2 (all updates installed)
> two PowerEdge servers in company B, at separate sites connected over
> network fiber, with Citrix XenServer 4 installed.
>
> The problem is that from time to time both servers restart without
> apparent reason. The logs only show messages about a network failure,
> and after that the server restarts.
> In company A this problem occurred 2 or 3 times, and the last
> incident was 4 months ago.
> I had forgotten about it because I thought the cause could be the
> electrical power line at that company.
> But now in company B I have the same problem for the first time (after
> several months working fine), and these servers are connected to a UPS line.
>
> Both groups of servers run a virtualization server, but from
> different vendors and with different configurations.
> Memory, disks and network work fine on all four servers, and the DRBD
> resources contain only data from the VMs, no files or data of the host itself.
>
> I don't understand why the servers restart when they receive an error
> from the network !!???
> In case of a problem like this, I would expect the VMs to restart,
> but not the complete server.
>
> Below are the logs and config file of the two servers in company B...
Hello,
Once more I have the same problem, but at another company, with other
hardware and another configuration !! :D
My problem: one node restarts (there are no errors in its log files) and
a couple of seconds later the other node restarts too. On the second node
I have these lines:
================
May 8 16:31:46 blueback kernel: drbd1: PingAck did not arrive in time.
May 8 16:31:46 blueback kernel: drbd1: peer( Secondary -> Unknown ) conn( Connected -> NetworkFailure ) pdsk( UpToDate -> DUnknown )
May 8 16:31:46 blueback kernel: drbd1: asender terminated
May 8 16:31:46 blueback kernel: drbd1: Terminating asender thread
May 8 16:31:46 blueback kernel: drbd1: short read expecting header on sock: r=-512
May 8 16:31:46 blueback kernel: drbd1: Creating new current UUID
May 8 16:31:46 blueback kernel: drbd0: PingAck did not arrive in time.
May 8 16:31:46 blueback kernel: drbd0: peer( Primary -> Unknown ) conn( Connected -> NetworkFailure ) pdsk( UpToDate -> DUnknown )
May 8 16:31:46 blueback kernel: drbd0: asender terminated
May 8 16:31:46 blueback kernel: drbd0: Terminating asender thread
May 8 16:31:46 blueback kernel: drbd0: short read expecting header on sock: r=-512
May 8 16:38:19 blueback kernel: drbd: initialised. Version: 8.2.7 (api:88/proto:86-88)
May 8 16:38:19 blueback kernel: drbd: GIT-hash: 61b7f4c2fc34fe3d2acf7be6bcc1fc2684708a7d build by root@localhost.localdomain, 2009-02-19 13:47:49
================
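To make the timeline easier to read, here is a minimal Python sketch that pulls the DRBD state changes out of the excerpt above and measures how long the node took to come back. It is only an illustration, not a DRBD tool: the function names (`parse`, `transitions`, `seconds_until_reinit`) are hypothetical, the wrapped log lines are joined back into single lines, and the year 2009 is an assumption because syslog timestamps do not include it.

```python
import re
from datetime import datetime

# Excerpt of the log above, with wrapped lines joined back together.
LOG = """\
May 8 16:31:46 blueback kernel: drbd1: PingAck did not arrive in time.
May 8 16:31:46 blueback kernel: drbd1: peer( Secondary -> Unknown ) conn( Connected -> NetworkFailure ) pdsk( UpToDate -> DUnknown )
May 8 16:31:46 blueback kernel: drbd0: PingAck did not arrive in time.
May 8 16:31:46 blueback kernel: drbd0: peer( Primary -> Unknown ) conn( Connected -> NetworkFailure ) pdsk( UpToDate -> DUnknown )
May 8 16:38:19 blueback kernel: drbd: initialised. Version: 8.2.7 (api:88/proto:86-88)
"""

# "<Mon> <day> <time> <host> kernel: <device>: <message>"
LINE = re.compile(r"^(\w{3}\s+\d+ \d\d:\d\d:\d\d) \S+ kernel: (drbd\d*): (.*)$")
# "field( Old -> New )" as printed by DRBD on a state change
TRANSITION = re.compile(r"(\w+)\( (\w+) -> (\w+) \)")


def parse(log, year=2009):
    """Return (timestamp, device, message) for every DRBD kernel line.

    The year is an assumption: syslog does not record it.
    """
    events = []
    for raw in log.splitlines():
        match = LINE.match(raw)
        if not match:
            continue
        stamp, device, message = match.groups()
        when = datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")
        events.append((when, device, message))
    return events


def transitions(events):
    """Return (device, field, old_state, new_state) for each state change."""
    changes = []
    for _when, device, message in events:
        for field, old, new in TRANSITION.findall(message):
            changes.append((device, field, old, new))
    return changes


def seconds_until_reinit(events):
    """Seconds from the first missed PingAck to the last module init line."""
    failed = [w for w, _d, m in events if "PingAck did not arrive" in m]
    booted = [w for w, _d, m in events if m.startswith("initialised")]
    if not failed or not booted:
        return None
    return int((booted[-1] - failed[0]).total_seconds())


if __name__ == "__main__":
    parsed = parse(LOG)
    for change in transitions(parsed):
        print(change)
    print("seconds until drbd re-initialised:", seconds_until_reinit(parsed))
```

On this excerpt the gap between the first missed PingAck (16:31:46) and the module being loaded again (16:38:19) is 393 seconds, and nothing is logged in between.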
and I don't understand the origin of the problem.
This is my current configuration:
- two PowerEdge R900 servers, with 64 GB RAM and 8 CPUs
- two DRBD resources (drbd0 and drbd1)
- one gigabit NIC used exclusively for DRBD
These are the software versions:
XenServer 5.0 (update 2) with kernel 2.6.18-92.1.10.el5.xs5.0.0.404.646xen
DRBD version: 8.2.7 (api:88)
and my drbd.conf is:
================
# drbdadm dump
# /etc/drbd.conf
resource disco1 {
    protocol B;
    on sockeye.multiexportfoods.com {
        device /dev/drbd0;
        disk /dev/sdb1;
        address ipv4 10.0.1.70:7788;
        meta-disk internal;
    }
    on blueback.multiexportfoods.com {
        device /dev/drbd0;
        disk /dev/sdb1;
        address ipv4 10.0.1.60:7788;
        meta-disk internal;
    }
    disk {
        on-io-error detach;
        max-bio-bvecs 1;
    }
    syncer {
        rate 50M;
        al-extents 257;
    }
    startup {
        wfc-timeout 60;
        degr-wfc-timeout 10;
    }
}
resource disco2 {
    protocol B;
    on sockeye.multiexportfoods.com {
        device /dev/drbd1;
        disk /dev/sdb2;
        address ipv4 10.0.1.70:7789;
        meta-disk internal;
    }
    on blueback.multiexportfoods.com {
        device /dev/drbd1;
        disk /dev/sdb2;
        address ipv4 10.0.1.60:7789;
        meta-disk internal;
    }
    disk {
        on-io-error detach;
        max-bio-bvecs 1;
    }
    syncer {
        rate 1G;
        al-extents 257;
    }
    startup {
        wfc-timeout 60;
        degr-wfc-timeout 10;
    }
}
================
Any ideas, anyone ?? Please !! :-)
thanks.
--
Victor Hugo dos Santos
Linux Counter #224399