On Fri, Feb 20, 2009 at 2:37 PM, Victor Hugo dos Santos <listas.vhs at gmail.com> wrote:
> Hello,
>
> I have a problem with drbd-0.7.25 and drbd-0.8.2.6... my situation is:
>
> two servers Supermicro in company A connected with crossover cable and
> CentOS 5.2 (all updates installed)
> two servers Poweredge in company B connected with network fiber in
> separate sites and Citrix XenServer 4 installed.
>
> the problem is that time in time, both servers restart without
> apparent reason.. in logs, only show messages about network failure
> and after this, server restart.
> in company A... this problem occurred 2 or 3 times and the last
> incident is on 4 months ago..
> and I had forget this problem.. because, I think that could be for
> electrical energy line in this company.
> but now, in company B.. I have the same problem for first time (after
> various months work fine) and this servers is connected in UPS line.
>
> two servers groups are running a Virtualization Server.. but from
> different vendors and configurations..
> Memory, disks and network work fine in four servers and, DRBD resource
> contain only data from VMs, none files/data from owner server.
>
> and I don't understand why servers restart when receive an error from
> network !!???
> and in case of problem..I think that restart of VMs is probably but
> not of complete Server.
>
> Above, logs and config file of two servers in company B...

Hello,

Once more I have the same problem, but at another company, with
different hardware and a different configuration! :D

My problem: one node restarts (there are no errors in its log files)
and, a couple of seconds later, the other node restarts too.

On the second node I have these lines:

================
May 8 16:31:46 blueback kernel: drbd1: PingAck did not arrive in time.
May 8 16:31:46 blueback kernel: drbd1: peer( Secondary -> Unknown ) conn( Connected -> NetworkFailure ) pdsk( UpToDate -> DUnknown )
May 8 16:31:46 blueback kernel: drbd1: asender terminated
May 8 16:31:46 blueback kernel: drbd1: Terminating asender thread
May 8 16:31:46 blueback kernel: drbd1: short read expecting header on sock: r=-512
May 8 16:31:46 blueback kernel: drbd1: Creating new current UUID
May 8 16:31:46 blueback kernel: drbd0: PingAck did not arrive in time.
May 8 16:31:46 blueback kernel: drbd0: peer( Primary -> Unknown ) conn( Connected -> NetworkFailure ) pdsk( UpToDate -> DUnknown )
May 8 16:31:46 blueback kernel: drbd0: asender terminated
May 8 16:31:46 blueback kernel: drbd0: Terminating asender thread
May 8 16:31:46 blueback kernel: drbd0: short read expecting header on sock: r=-512
May 8 16:38:19 blueback kernel: drbd: initialised. Version: 8.2.7 (api:88/proto:86-88)
May 8 16:38:19 blueback kernel: drbd: GIT-hash: 61b7f4c2fc34fe3d2acf7be6bcc1fc2684708a7d build by root at localhost.localdomain, 2009-02-19 13:47:49
================

I don't understand the origin of the problem.

This is my current configuration:
- two PowerEdge R900 servers, with 64G RAM and 8 CPUs
- two DRBD resources (drbd0 and drbd1)
- one gigabit NIC used exclusively for DRBD

These are the software versions:
XenServer 5.0 (update 2) with kernel 2.6.18-92.1.10.el5.xs5.0.0.404.646xen
DRBD Version: 8.2.7 (api:88)

And my drbd.conf is:

================
# drbdadm dump
# /etc/drbd.conf
resource disco1 {
    protocol B;
    on sockeye.multiexportfoods.com {
        device /dev/drbd0;
        disk /dev/sdb1;
        address ipv4 10.0.1.70:7788;
        meta-disk internal;
    }
    on blueback.multiexportfoods.com {
        device /dev/drbd0;
        disk /dev/sdb1;
        address ipv4 10.0.1.60:7788;
        meta-disk internal;
    }
    disk {
        on-io-error detach;
        max-bio-bvecs 1;
    }
    syncer {
        rate 50M;
        al-extents 257;
    }
    startup {
        wfc-timeout 60;
        degr-wfc-timeout 10;
    }
}

resource disco2 {
    protocol B;
    on sockeye.multiexportfoods.com {
        device /dev/drbd1;
        disk /dev/sdb2;
        address ipv4 10.0.1.70:7789;
        meta-disk internal;
    }
    on blueback.multiexportfoods.com {
        device /dev/drbd1;
        disk /dev/sdb2;
        address ipv4 10.0.1.60:7789;
        meta-disk internal;
    }
    disk {
        on-io-error detach;
        max-bio-bvecs 1;
    }
    syncer {
        rate 1G;
        al-extents 257;
    }
    startup {
        wfc-timeout 60;
        degr-wfc-timeout 10;
    }
}
================

Any idea? Please! :-)

Thanks.

--
--
Victor Hugo dos Santos
Linux Counter #224399