[DRBD-user] System lockup with DRBD

chambal 2iow-li6l at dea.spamcon.org
Tue Nov 2 07:53:36 CET 2010

"Robert Dunkley" <Robert at saq.co.uk> wrote:

>Can you try with Intel Nics installed in those Via boards? NICs would be
>my first choice if the problem is hardware related. I have used DRBD
>with Intel SSDs, works fine.

What hardware and OS/kernel did you use with the SSDs?

Thanks for the NIC idea.  I found some Intel PCI Ethernet cards, but
they don't fit in the Mini-ITX, so I have ordered extenders and will
try the cards once those arrive.

In the meantime, I checked the network drivers.  The ones included
in CentOS 5.5 for the VIA Velocity (VT6120/VT6121/VT6122) report
V1.13 in the syslog startup messages.  VIA's site has a newer
driver for this chipset; the Linux version is V1.30.  I installed
and activated it, rebooted, and verified that it now reports V1.30.

Unfortunately the newer driver didn't solve the lockup problem.  It
did fix a separate problem that showed up during intensive
read/write testing on the DRBD shared partition, where I saw frequent:

   eth1: excessive work at interrupt

in both the Primary and Secondary syslogs.  With the newer driver
those messages no longer appear, but the core problem remains.

I am wondering if there is a combination of standard GNU/Linux
command-line tools that could be used in a script to work with
the disk and network to approximate how DRBD interacts with the
system.  If this were possible, and I could trigger the problem
this way, it would at least let me demonstrate that the problem
is not "something with DRBD".
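
To make the idea concrete, here is a rough sketch of what such a
test might look like, written in Python rather than shell.  It is
only a user-space approximation and an assumption on my part: one
side writes a block to local disk, ships the same block over TCP,
and waits until the peer has written and fsync'ed it before going
on, which loosely resembles fully synchronous replication.  It does
not go through the DRBD kernel module, so it cannot reproduce
anything that happens only inside that code path.  The block size,
the file names, the one-byte ack and all function names are made up
for illustration.

```python
import os
import socket
import struct
import tempfile
import threading

BLOCK_SIZE = 4096  # assumed block size, not taken from any DRBD setting


def recv_exact(conn, n):
    """Read exactly n bytes from a socket, or return None on early EOF."""
    buf = bytearray()
    while len(buf) < n:
        chunk = conn.recv(n - len(buf))
        if not chunk:
            return None
        buf.extend(chunk)
    return bytes(buf)


def run_secondary(server_sock, path):
    """Accept one peer; write each received block, fsync it, then ack."""
    conn, _ = server_sock.accept()
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        while True:
            header = recv_exact(conn, 8)
            if header is None:
                break  # peer closed the connection
            (offset,) = struct.unpack("!Q", header)
            block = recv_exact(conn, BLOCK_SIZE)
            if block is None:
                break
            os.pwrite(fd, block, offset)
            os.fsync(fd)
            conn.sendall(b"A")  # hypothetical one-byte acknowledgement
    finally:
        os.close(fd)
        conn.close()


def run_primary(host, port, path, n_blocks):
    """Write each block locally and to the peer; wait for the peer's ack."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        with socket.create_connection((host, port)) as conn:
            for i in range(n_blocks):
                offset = i * BLOCK_SIZE
                block = os.urandom(BLOCK_SIZE)
                conn.sendall(struct.pack("!Q", offset) + block)
                os.pwrite(fd, block, offset)
                os.fsync(fd)
                if recv_exact(conn, 1) != b"A":
                    raise RuntimeError("peer closed before acknowledging")
    finally:
        os.close(fd)


def selftest(n_blocks=16):
    """Run both roles on localhost; return True if the two files match."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]
    with tempfile.TemporaryDirectory() as tmpdir:
        a = os.path.join(tmpdir, "primary.img")
        b = os.path.join(tmpdir, "secondary.img")
        peer = threading.Thread(target=run_secondary, args=(server, b))
        peer.start()
        run_primary("127.0.0.1", port, a, n_blocks)
        peer.join()
        server.close()
        with open(a, "rb") as fa, open(b, "rb") as fb:
            return fa.read() == fb.read()
```

For a real test, run_secondary would run on one unit and run_primary
on the other, in a long loop over the same NICs and SSDs that DRBD
uses.  selftest() only checks on a single machine that the two
halves work together.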

>-----Original Message-----
>From: drbd-user-bounces at lists.linbit.com
>[mailto:drbd-user-bounces at lists.linbit.com] On Behalf Of chambal
>Sent: 29 October 2010 09:50
>To: drbd-user at lists.linbit.com
>Subject: Re: [DRBD-user] System lockup with DRBD
>chambal <2iow-li6l at dea.spamcon.org> wrote:
>
>>I have a pair of VIA M800 Mini-ITX with SSD (one OCZ
>>Vertex-Turbo, one Intel), and CentOS 5.5 with current patches.
>>When I have DRBD active on both units, at some random point but
>>always within one day, one of the units has completely locked up.
>>In all but one case, it's the Primary unit.
>>
>>When I say locked up, I mean the PC is completely frozen -
>>keyboard is dead (can't toggle numlock, and Alt-SysRq - which is
>>enabled - doesn't work), there's no kernel panic dump on the
>>physical console, there's no response to tapping the power
>>switch, and it can't be pinged.  There's nothing in the syslog
>>after it's forcibly rebooted.
>>
>>Possibly important clue: the front panel LED for hard disk
>>activity is solidly on when the failure occurs.
>>
>>When I have DRBD running on only the active (Primary) unit (did
>>"service drbd stop" on the inactive (Secondary) unit), this
>>lockup never occurs.
>>
>>There is not very much disk read/write activity on the shared
>>partition.  Both units are on the same local private LAN segment.
>>
>>Originally I was using DRBD 8.0.1 (which didn't have this problem
>>on different much older hardware and OS), then updated to DRBD
>>8.0.16, then yesterday to 8.3.9.  No difference in the problem.
>>Because the kernel is 2.6.18-194.17.1.el5 I still have to use a
>>kernel module.
>>
>>I am rather lost on how to proceed in tracking down the cause of
>>this problem or a solution.
>
>I received an email response from someone running the exact same CentOS
>5.5 and kernel version, and DRBD 8.3.9.  So this would seem to point to
>the hardware, or an interaction between the hardware and software.
>
>Has anyone run DRBD on a VIA EPIA-M800 Mini-ITX?
>Has anyone run DRBD on SSD?