[DRBD-user] Initial Sync - Fast then really slow

Jeff Goris jeff.goris at whiterabbit.com.au
Sun Apr 11 17:38:05 CEST 2004



> > That is correct. The machine is only doing DRBD - both machines are freshly
> > installed with just DRBD and heartbeat set up. The problem occurs whether
> > /dev/nb0 is mounted or unmounted. The only resources Heartbeat manages in
> > the cluster are one DRBD device and one virtual IP address.
> 
> please reproduce it without heartbeat...  "by hand"
> Thanks.
> 
> 	Lars
> 

I managed to reproduce it again. /dev/nb0 was unmounted, and heartbeat and DRBD 
were stopped on both hosts. I then checked that all RAID devices were healthy 
and that the time on both hosts was correct and synchronised. I started DRBD 
(command 'service drbd start') on the host that was last primary, then started 
DRBD on the secondary and monitored the hosts during the full sync of /dev/nb0. 
When it finished, I set the clock on the secondary forward 8 hours with the 
'date' command. Finally, I started a resync on the primary with the 
command 'drbdsetup /dev/nb0 replicate'. I monitored both RAID and DRBD during 
the resync until the secondary host locked up. I did not see the sync rate 
drop prior to the lockup.
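
For anyone wanting to retrace the sequence above, here is a rough dry-run outline in Python. It only prints the steps and executes nothing. Only 'service drbd start' and 'drbdsetup /dev/nb0 replicate' are quoted from this message; every other command (the umount, the heartbeat and drbd stop commands, the /proc checks, and the exact 'date' arguments) is an assumption added for illustration and may differ on a real system.

```python
# Dry-run outline of the reproduction sequence; nothing is executed.
# Steps marked "post" use commands quoted in the message; steps marked
# "assumed" are illustrative guesses and may differ on a real system.
STEPS = [
    ("both", "assumed", "umount /dev/nb0"),
    ("both", "assumed", "service heartbeat stop"),
    ("both", "assumed", "service drbd stop"),
    ("both", "assumed", "cat /proc/mdstat"),          # check RAID health
    ("primary", "post", "service drbd start"),        # host that was last primary
    ("secondary", "post", "service drbd start"),
    ("both", "assumed", "cat /proc/drbd"),            # wait for the full sync to finish
    ("secondary", "assumed", "date -s '+8 hours'"),   # push the clock forward
    ("primary", "post", "drbdsetup /dev/nb0 replicate"),
]


def format_plan(steps):
    """Return one numbered, human-readable line per step."""
    return [
        f"{number}. [{host}] {command} ({source})"
        for number, (host, source, command) in enumerate(steps, start=1)
    ]


for line in format_plan(STEPS):
    print(line)
```

The time synchronisation check and the monitoring during the final resync are left out because the message does not say how they were done.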

I now suspect that the slow sync rate was due to the software RAID 1 also 
syncing, as you "guessed", since the last two times I reproduced this lockup I 
did not see the sync rate drop. However, I am pretty sure that the lockup on 
the secondary occurs when its system clock is drifting from a time in the 
future back to the correct time whilst DRBD is resyncing. I can't see what 
else could be causing the host to lock up whilst DRBD is resyncing. I've tried 
stopping everything other than DRBD and NTPD.
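
One way to gather evidence for or against the clock theory would be to log how the wall clock moves relative to a clock that is not adjusted, and compare the timestamps with the moment of the lockup. The following is a minimal sketch in Python, not a tested diagnostic. The function names, the sampling interval, and the threshold are all hypothetical, and it assumes a platform where Python exposes CLOCK_MONOTONIC_RAW, falling back to time.monotonic() otherwise (which can itself be slewed by NTP).

```python
import time


def raw_monotonic():
    """Seconds from a clock that NTP does not slew, where available."""
    clock_id = getattr(time, "CLOCK_MONOTONIC_RAW", None)
    if clock_id is not None:
        return time.clock_gettime(clock_id)
    return time.monotonic()  # fallback; may be slewed by NTP


def clock_offset():
    """Wall-clock time minus the unadjusted clock, in seconds."""
    return time.time() - raw_monotonic()


def watch(samples, interval, threshold, sleep=time.sleep, offset=clock_offset):
    """Sample the offset and report changes larger than threshold.

    Returns a list of (sample_index, change_in_seconds). A negative
    change means the wall clock moved backwards relative to the
    unadjusted clock during that interval.
    """
    events = []
    previous = offset()
    for index in range(1, samples + 1):
        sleep(interval)
        current = offset()
        change = current - previous
        if abs(change) > threshold:
            events.append((index, change))
        previous = current
    return events
```

For example, watch(3600, 1.0, 0.01) would sample once a second for an hour and return every interval in which the wall clock moved by more than 10 ms relative to the raw clock. Since the host locks up, the results would need to be written somewhere that survives the hang (a remote log, for instance), which this sketch does not do.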

If you are very sure that DRBD should not be failing under these conditions, 
then I think I will need to try a fresh "minimal" install of RedHat without 
software RAID and without channel bonding, then reintroduce components one at 
a time to see if I can ascertain which one is causing the problem.

Cheers,
Jeff.



