Not having much luck pursuing this - on a newer OS, I cannot even
run DRBD - I get an immediate kernel panic.

I was hopeful that a newer OS on this VIA EPIA-M800 hardware (with
OCZ Vertex-Turbo SSD) might solve the intermittent lockup problem.
I chose Fedora 13. I loaded it from the DVD, put drbd-8.3.9.tar.gz
on it and did:

  ./configure --prefix=/usr/local --sbindir=/usr/local/sbin \
      --localstatedir=/var --sysconfdir=/etc \
      --without-heartbeat --without-pacemaker --without-xen
  make clean
  make
  make install
  chkconfig --add drbd

Cleared and initialized storage via:

  dd if=/dev/zero bs=1M count=1 of=/dev/sda2
  drbdadm create-md r0

When I then did "service drbd start", I got a kernel panic.

I tried 8.3.7 instead, same result. Went back to 8.3.9, added
"--with-km" to configure, same result.

I played with the configuration file - if it didn't define a valid
resource, no kernel panic, otherwise it crashes.

I also tried "yum update" which took the kernel from
2.6.33.3-85.fc13.i686.PAE to 2.6.34.7-61.fc13.i686.PAE and after
again compiling/installing drbd, it still crashes.

The syslog never has a record of the panic details.

The strange thing is, the older CentOS 5.5 with drbd 8.3.9 on the
same hardware works fine except for the within-a-day lockup
problem.

Configuration: NetworkManager service turned off, network on,
ifcfg-eth1 has IP=10.0.1.151, sysconfig/network has hostname set
to f13-1.sync, hosts file has that IP and name. Partner unit is
not yet set up.

drbd.conf (minimal for testing):

  resource r0 {
    protocol C;
    on f13-1.sync {
      device /dev/drbd1;
      disk /dev/sda2;
      address 10.0.1.151:7788;
      meta-disk internal;
    }
    on f13-2.sync {
      device /dev/drbd1;
      disk /dev/sda2;
      address 10.0.1.152:7788;
      meta-disk internal;
    }
  }

chambal <2iow-li6l at dea.spamcon.org> wrote:

>"Robert Dunkley" <Robert at saq.co.uk> wrote:
>
>>Can you try with Intel Nics installed in those Via boards? NICs would be
>>my first choice if the problem is hardware related. I have used DRBD
>>with Intel SSDs, works fine.
>
>What hardware and OS/kernel did you use with the SSDs?
>
>Thanks for the NIC idea. Found some Intel PCI Ethernet cards but
>they don't fit in the Mini-ITX, have ordered extenders so I can
>try them.
>
>In the meantime, checked the network drivers - the ones included
>in CentOS5.5 for the Via Velocity (VT6120/VT6121/VT6122) show
>V1.13 in the syslog startup messages. Checking VIA's site, there
>were newer ones for this chipset, the Linux part is V1.30.
>Installed and made active, rebooted, verified it shows V1.30.
>
>Unfortunately these didn't solve the lockup problem. They did
>solve a problem seen when I was doing intensive read/write
>testing on the DRBD shared partition, where I saw frequent:
>
>  eth1: excessive work at interrupt
>
>in both the Primary and Secondary syslogs. This newer driver
>solved that, no more such messages. But it doesn't solve the
>core problem.
>
>
>I am wondering if there is a combination of standard GNU/Linux
>command-line tools that could be used in a script to work with
>the disk and network to approximate how DRBD interacts with the
>system. If this were possible, and I could trigger the problem
>this way, it would at least let me demonstrate that the problem
>is not "something with DRBD".
>
>>
>>-----Original Message-----
>>From: drbd-user-bounces at lists.linbit.com
>>[mailto:drbd-user-bounces at lists.linbit.com] On Behalf Of chambal
>>Sent: 29 October 2010 09:50
>>To: drbd-user at lists.linbit.com
>>Subject: Re: [DRBD-user] System lockup with DRBD
>>
>>chambal <2iow-li6l at dea.spamcon.org> wrote:
>>
>>>I have a pair of VIA M800 Mini-ITX with SSD (one OCZ
>>>Vertex-Turbo, one Intel), and CentOS 5.5 with current patches.
>>>
>>>When I have DRBD active on both units, at some random point but
>>>always within one day, one of the units has completely locked up.
>>>In all but one case, it's the Primary unit.
>>>
>>>When I say locked up, I mean the PC is completely frozen -
>>>keyboard is dead (can't toggle numlock, and Alt-SysRq - which is
>>>enabled - doesn't work), there's no kernel panic dump on the
>>>physical console, there's no response to tapping the power
>>>switch, and it can't be pinged. There's nothing in the syslog
>>>after it's forcibly rebooted.
>>>
>>>Possibly important clue: the front panel LED for hard disk
>>>activity is solidly on when the failure occurs.
>>>
>>>When I have DRBD running on only the active (Primary) unit (did
>>>"service drbd stop" on the inactive (Secondary) unit), this
>>>lockup never occurs.
>>>
>>>There is not very much disk read/write activity on the shared
>>>partition. Both units are on the same local private LAN segment.
>>>
>>>Originally I was using DRBD 8.0.1 (which didn't have this problem
>>>on different much older hardware and OS), then updated to DRBD
>>>8.0.16, then yesterday to 8.3.9. No difference in the problem.
>>>Because the kernel is 2.6.18-194.17.1.el5 I still have to use a
>>>kernel module.
>>>
>>>I am rather lost on how to proceed in tracking down the cause of
>>>this problem or a solution.
>>
>>I received an email response from someone running the exact same CentOS
>>5.5 and kernel version, and DRBD 8.3.9. So this would seem to point to
>>the hardware, or an interaction between the hardware and software.
>>
>>Has anyone run DRBD on a VIA EPIA-M800 Mini-ITX?
>>
>>Has anyone run DRBD on SSD?
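
On the question raised in the thread, whether a script could approximate how DRBD works the disk and the network together, here is a minimal sketch of one way to try it. It is written in Python rather than with the command-line tools asked about, and everything in it is an assumption made for illustration: the function names, the 4096-byte block size, and the 12-byte header plus 1-byte ack are invented here and are not DRBD's wire protocol. The only thing it imitates is the general shape of "protocol C": each block is written and flushed locally, sent to the peer, and the writer waits until the peer reports its own flushed write before moving on.

```python
import os
import socket
import struct
import threading

BLOCK_SIZE = 4096  # assumed block size; real DRBD request sizes vary


def recv_exact(conn, n):
    """Read exactly n bytes from a socket, or return None if the peer closed."""
    buf = bytearray()
    while len(buf) < n:
        chunk = conn.recv(n - len(buf))
        if not chunk:
            return None
        buf.extend(chunk)
    return bytes(buf)


def run_secondary(listen_sock, path):
    """Accept one connection; write each received block to disk, then ack."""
    conn, _ = listen_sock.accept()
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
    try:
        while True:
            header = recv_exact(conn, 12)  # 8-byte offset + 4-byte length
            if header is None:
                break
            offset, length = struct.unpack("!QI", header)
            data = recv_exact(conn, length)
            if data is None:
                break
            os.pwrite(fd, data, offset)
            os.fsync(fd)           # force the block to stable storage
            conn.sendall(b"\x01")  # ack only after the flushed write
    finally:
        os.close(fd)
        conn.close()


def run_primary(peer_addr, path, n_blocks):
    """Write n_blocks locally, mirror each to the peer, wait for each ack."""
    sock = socket.create_connection(peer_addr)
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
    acked = 0
    try:
        for i in range(n_blocks):
            data = os.urandom(BLOCK_SIZE)
            offset = i * BLOCK_SIZE
            # Send to the peer first so network and local disk work overlap.
            sock.sendall(struct.pack("!QI", offset, len(data)) + data)
            os.pwrite(fd, data, offset)
            os.fsync(fd)
            if recv_exact(sock, 1) != b"\x01":
                raise ConnectionError("peer closed before acknowledging")
            acked += 1
    finally:
        os.close(fd)
        sock.close()
    return acked


def loopback_demo(primary_path, secondary_path, n_blocks=8):
    """Run both roles on one machine over 127.0.0.1 as a quick self-check."""
    listen = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listen.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listen.listen(1)
    t = threading.Thread(target=run_secondary, args=(listen, secondary_path))
    t.start()
    acked = run_primary(listen.getsockname(), primary_path, n_blocks)
    t.join()
    listen.close()
    return acked
```

On the two units themselves, one would presumably bind the listening socket on the second machine's LAN address, call run_primary on the first with a much larger n_blocks, and point both paths at scratch files or a spare partition rather than the DRBD backing device. Because this runs entirely in user space, it does not exercise DRBD's kernel code paths: a lockup under this load would point away from DRBD, but no lockup would not prove much either way.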