/ 2004-07-16 13:59:56 -0500
\ John Lange:
> On Fri, 2004-07-16 at 13:37, Lars Ellenberg wrote:
> > > internal interface (the one used for replication) is dropping a very
> > > high percentage of packets (60%).
> >
> > doh. broken hardware can always result in strange behaviour...
>
> So just confirming that it is not normal for the interface to drop
> packets even when fully saturated with replication.
>
> I'm trying to talk the data centre into "giving" me a new cross over
> cable. If they say no then I will have to go on a road trip... :\
>
> > > Here is cat /proc/drbd during syncing.
> > >
> > > 0: cs:SyncingAll st:Secondary/Primary ns:0 nr:81069096 dw:81069096 dr:0
> > > pe:0 ua:17
> > > [==============>.....] sync'ed: 71.7% (31375/110540)M
> > > finish: 0:10:39h speed: 55,406 (50,666) K/sec
> >
> > and this is which DRBD version ??
> > if it is anything < 0.6.12,
> > please try if 0.6.12 /0.6.13 changes anything in this behaviour.
>
> It is drbd-0.6.12 .
>
> Is there any reason to believe that 0.6.13 might improve stability in
> this situation?

no. only "cosmetic" changes there, iirc.

have you any indication *what* may hang?
can you enable sysrq, and, when it hangs again, hit sysrq-t (which is
showTask)... and see which processes are in "D" state, and where (this
includes a stack backtrace of each process).
(if you hook up a serial console, you could capture the output)

what happens when you unplug the cable?
does the "hung" node recover then?

    Lars Ellenberg
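
As a rough userland complement to the sysrq-t suggestion above, the following sketch lists processes currently in "D" (uninterruptible sleep) state by reading /proc. It is only an illustration, not part of DRBD or of the original advice: the function names are made up for this example, it assumes a Linux /proc layout with a per-process `wchan` file (not present on every kernel), and it does not produce the kernel stack backtraces that sysrq-t writes to the console. On a badly hung node it may not get to run at all, which is why the serial console route is still the more reliable one.

```python
import os


def parse_stat(text):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so split on the first '(' and the last ')'.
    left = text.index("(")
    right = text.rindex(")")
    pid = int(text[:left].strip())
    comm = text[left + 1:right]
    state = text[right + 1:].split()[0]
    return pid, comm, state


def read_wchan(pid, proc_root="/proc"):
    """Return the kernel wait channel name for pid, or '?' if unavailable."""
    try:
        with open(os.path.join(proc_root, str(pid), "wchan")) as f:
            return f.read().strip() or "-"
    except OSError:
        return "?"


def d_state_tasks(proc_root="/proc"):
    """Return a list of (pid, comm, wchan) for processes in 'D' state."""
    tasks = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited meanwhile, or entry is unreadable
        if state == "D":
            tasks.append((pid, comm, read_wchan(pid, proc_root)))
    return tasks


if __name__ == "__main__":
    for pid, comm, wchan in d_state_tasks():
        print(f"{pid:>7}  {comm:<20} {wchan}")
```

Run as root on the node while it appears stuck; anything listed is blocked inside the kernel, and the wait channel column (where available) hints at where.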