[DRBD-user] lockup on primary node

Lars Ellenberg Lars.Ellenberg at linbit.com
Fri Jul 16 21:16:12 CEST 2004


/ 2004-07-16 13:59:56 -0500
\ John Lange:
> On Fri, 2004-07-16 at 13:37, Lars Ellenberg wrote:
> > > internal interface (the one used for replication) is dropping a very
> > > high percentage of packets (60%).
> > 
> > doh. broken hardware can always result in strange behaviour...
> 
> So just confirming that it is not normal for the interface to drop
> packets even when fully saturated with replication.
> 
> I'm trying to talk the data centre into "giving" me a new cross over
> cable. If they say no then I will have to go on a road trip... :\
> 
> > > Here is cat /proc/drbd during syncing.
> > > 
> > > 0: cs:SyncingAll st:Secondary/Primary ns:0 nr:81069096 dw:81069096 dr:0
> > > pe:0 ua:17
> > >         [==============>.....] sync'ed: 71.7% (31375/110540)M
> > >         finish: 0:10:39h speed: 55,406 (50,666) K/sec
> > 
> > and this is which DRBD version ??
> > if it is anything < 0.6.12,
> > please try if 0.6.12 /0.6.13 changes anything in this behaviour.
> 
> It is drbd-0.6.12 . 
> 
> Is there any reason to believe that 0.6.13 might improve stability in
> this situation?

no. only "cosmetic" changes there, iirc.

have you any indication of *what* may hang?
can you enable sysrq and, when it hangs again,
hit sysrq-t (which is showTask)... then see which processes are in "D" state
(uninterruptible sleep), and where (the output includes a stack backtrace of each process).
(if you hook up a serial console, you could capture the output)
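
If the box still answers on a shell while DRBD is stuck, a rough way to list the "D" state processes without sysrq is to read the state field of each /proc/<pid>/stat. The sketch below is only an illustration: the function names are made up for this example, it assumes a Python interpreter is available on the node, and it does not give the stack backtraces that sysrq-t prints, so it is a complement to sysrq-t and not a replacement.

```python
import os


def parse_stat_line(line):
    """Return (pid, comm, state) from the text of one /proc/<pid>/stat file."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so split on the first '(' and the last ')'.
    open_paren = line.index("(")
    close_paren = line.rindex(")")
    pid = int(line[:open_paren])
    comm = line[open_paren + 1:close_paren]
    state = line[close_paren + 1:].split()[0]
    return pid, comm, state


def blocked_processes(proc_root="/proc"):
    """List (pid, comm) for every process currently in state 'D'."""
    blocked = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat_line(f.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited, or the line was not parseable
        if state == "D":
            blocked.append((pid, comm))
    return blocked


if __name__ == "__main__":
    for pid, comm in blocked_processes():
        print(pid, comm)
```

Run it a few times in a row: a process that shows up in "D" state on every pass is a better candidate for "hung" than one that appears once during normal disk I/O.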

what happens when you unplug the cable?
does the "hung" node recover then?
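
To see whether the replication link is still losing packets before and after such a test, one option is to compare the per-interface counters in /proc/net/dev. This is a minimal sketch under some assumptions: the interface name "eth1" is a placeholder for whatever the cross-over link is called on your nodes, the helper name is invented for this example, and whether a driver counts a dropped packet in "drop" or in "errs" varies, so look at both.

```python
def parse_net_dev(text):
    """Parse the contents of /proc/net/dev into {iface: {counter: value}}."""
    stats = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # the two header lines carry no colon
        name, _, rest = line.partition(":")
        fields = rest.split()
        if len(fields) < 16:
            continue  # unexpected layout, skip the line
        values = [int(v) for v in fields[:16]]
        # Columns 0-7 are receive counters, columns 8-15 are transmit counters.
        stats[name.strip()] = {
            "rx_packets": values[1],
            "rx_errs": values[2],
            "rx_drop": values[3],
            "tx_packets": values[9],
            "tx_errs": values[10],
            "tx_drop": values[11],
        }
    return stats


if __name__ == "__main__":
    with open("/proc/net/dev") as f:
        counters = parse_net_dev(f.read())
    # "eth1" is an assumed name for the replication interface.
    print(counters.get("eth1"))
```

Take one sample, let the sync run for a minute, take another, and compare the differences; counters that climb quickly on the replication interface only would point at the cable or the NIC.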

	Lars Ellenberg


