> On Tuesday 06 April 2004 07:18, Jeff Goris wrote:
> > > I am running an IDE software raid on each box as well, which I know
> > > could have some impact on performance, although I'm guessing it should
> > > be capable of sustaining more throughput than just 500k/s.
> >
> > I'm running a fairly similar configuration. RedHat 9.0, drbd-0.6.12,
> > channel bonded gigabit crossover, with 90GB software RAID devices on each
> > node and I came across this same problem when I set it up.
> >
> > > Any idea's? Thanks,
> >
> > Perhaps check the clocks on each machine. I did notice that the clocks on
> > each of my hosts were different as one was set to the wrong timezone. I
> > fixed the timezone and had each host sync off an NTP source to ensure they
> > kept accurate time. Then the replication rate was stable the whole way
> > through at about 22,500 K/sec.
> >
> > Jeff.
>
> Just an uneducated guess:
>
> Might it be that the software raid started its resync process ?
>
> -Philipp

Your 'uneducated guess' certainly fits the symptoms. I did check the RAID
device status on some of the occasions the problem occurred, but the
problem also happened when all the RAID devices were healthy and
synchronised. At first I would take drbd down to stop the resync, as it
would have taken days otherwise. Eventually, since I couldn't find what
was wrong, I let it try to resync at 500 K/sec. Whenever the rate dropped
to 500 K/sec during a sync, one of the hosts would eventually freeze
completely and need the reset button pressed. It was always the same
host, regardless of whether I had set it to primary or secondary. I could
only get it to complete the sync without crashing when I left the max
sync rate at 2000K, which it was able to maintain the entire time. I have
seen the sync rate drop significantly on two other systems when the RAID
device was also resyncing, but drbd has always coped with this, although
that was with drbd-0.6.8 and earlier.

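As a rough check on the "would have taken days" estimate above, here is a
minimal sketch of the arithmetic. It assumes the whole 90GB device has to
be copied, that "K/sec" means KiB per second and "GB" means GiB, and that
the rate stays constant; the function name is made up for illustration.

```python
def sync_duration_seconds(device_gib: float, rate_kib_per_s: float) -> float:
    """Seconds needed to copy device_gib GiB at a constant rate_kib_per_s KiB/s."""
    # Assumption: binary units, i.e. 1 GiB = 1024 * 1024 KiB.
    return device_gib * 1024 * 1024 / rate_kib_per_s


if __name__ == "__main__":
    # The three rates mentioned in this thread, in K/sec.
    for rate in (500, 2000, 22500):
        hours = sync_duration_seconds(90, rate) / 3600
        print(f"{rate:>6} K/sec: {hours:6.1f} hours")
```

Under those assumptions a full sync takes roughly 52 hours at 500 K/sec,
about 13 hours at 2000K, and a little over an hour at 22,500 K/sec.
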
I had actually put this problem aside for over a week due to other, more
urgent work. It was only when using rpm to install some security updates
that I became aware of the time zone problem on the dodgy host. I'm not
positive that I changed nothing else on either host. Immediately after
fixing the time I tried a max sync rate of 50 M/sec and managed to get it
to sync fairly constantly at 22,500 K/sec.

However, you have me thinking that your guess is right and that I may
have been experiencing the problems due to the software RAID resyncing at
the same time, especially since the RAID would have to resync too after
the host crashed. I will attempt a drbd resync at the same time as a
software RAID resync and see how it goes.

Jeff.