[DRBD-user] drbd-0.7.0 with linux-2.4. slow?

Thu Jul 29 12:45:26 CEST 2004

/ 2004-07-29 01:10:31 +0200
\ Bernd Schubert:
> > can you post more verbose test results on some website, or here?
> > which file system?
> 
> The filesystem is reiserfs.

btw, what did drbd report about its estimated
syncer performance during initial full sync, or during any other sync?
(grep for "drbd.: Resync done" in syslog)

do you run iozone on some client on the nfs mount,
or on the host itself on a direct mount?

[ snip iozone output ]

I'd also be interested in
 rm wol.dat rwol.dat
 iozone -s 4m -r 4k -i0 -o -c -e -Q
 iozone -s 1g -r 1m -r4m -i0 -o -c -e -Q
and the resulting output of
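 awk '
  function report() {
	# print and reset the stats; skip empty groups (no division by zero)
	if (N > 0)
		printf "N:\t%8d\nmin:\t%8d\navg:\t%8d\nmax:\t%8d\n",
			N, min, sum/N, max;
	N=min=max=sum=0;
  }
  # data lines: two integer columns, $2 is taken as the latency value
  /^ +[0-9]+ +[0-9]+$/ {
	if (N==0 || $2 > max) max=$2;
	if (N==0 || $2 < min) min=$2;
	sum+=$2;
	N++;
  }
  /^$/ { report() }	# a blank line ends a group
  END  { report() }	# flush the last group if no blank line follows
  ' wol.dat rwol.dat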

(latency figures...)

generally you want to include -c -e when running on nfs.

you are aware that -i0 only tests "linear" access.
for more interesting figures, include -i2 and/or -i8
(random/random_mix)

> I guess that we have too much memory for this test (3GB), so the numbers 
> are slightly unrealistic [for drbd-unconnected iozone-async ...]

do they change with -c -e ?

> [snip]
> 
> > > Well, so I'm asking here if someone here has an idea?
> > > Somehow I would like to try using protocol A or B instead of the current
> > > protocol C. It's worth a try, isn't it?
> > > Is it sufficient to change the protocol in the configuration files and
> > > then restart drbd?
> > > Lars, referring to your mail to LINUX-HA we should use drbd-0.7.1 for any
> > > other protocol than C? That's not a problem, but I don't want to reboot the
> > > server with a new kernel until the weekend.
> >
> > lacking standardized benchmarks, we did not yet settle on the slightly
> > arbitrary value we want to use to kick the lower level device on a
> > secondary with proto A or B... but yes, it should already be better than
> > what we have with 0.7.0.
> 
> O.k. I will try it over the weekend.
> 
> >
> > in any case, using DRBD adds more latency to your setup.
> > so if you have had an io latency on a normal disk that is basically
> > bound by rotation freq and seektime (say, 7 ms),
> > you now add in network latency and the io-latency on the second node...
> > for random writes, performance degrades considerably.
> 
> Of course, but that doesn't explain why 2.6.7 did so much better. 

now, 2.4 only has *one* thread to flush *all* devices.
2.6 can basically flush all devices "in parallel" ...
what backing storage device(s) did you use?

process scheduler latency as well as interrupt latency may have an
impact (HZ value). io-scheduler may have an impact.
maybe something has changed in the implementation of the network stack,
or likely in the nfs implementation, also.
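
to put rough numbers on the latency point quoted above, here is a minimal
sketch. only the 7 ms disk figure comes from the example; the 0.2 ms
network round trip and the 7 ms on the second node are my assumptions, and
latency_model is a made-up helper name. it prints the per-write latency
and the resulting ceiling for strictly serialized synchronous writes in
three cases: local disk only, local and remote write fully overlapped,
and the pessimistic case where all delays simply add up.

```shell
# latency_model DISK_MS NET_MS PEER_MS
# all three arguments are illustrative guesses, not measurements
latency_model() {
    awk -v disk="${1:-7}" -v net="${2:-0.2}" -v peer="${3:-7}" 'BEGIN {
        remote     = net + peer                        # time until the peer has acked
        overlapped = (disk > remote) ? disk : remote   # local and remote run in parallel
        serial     = disk + remote                     # nothing overlaps at all
        fmt = "%-22s %5.1f ms  ~%d writes/s\n"
        printf fmt, "local disk only:",       disk,       1000 / disk
        printf fmt, "replicated, overlapped:", overlapped, 1000 / overlapped
        printf fmt, "replicated, serial:",     serial,     1000 / serial
    }'
}

latency_model 7 0.2 7
```

real figures will land somewhere between the last two lines, and only a
benchmark on your hardware tells you where.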

> Unfortunately I do not have any iozone benchmark numbers for 2.6.7,
> maybe I have time to create them during the weekend (now that our server went
> into production and since people also work over the weekend, my group 
> won't like too many reboots ;) ).
> 
> >
> > as I said, lacking data, I can not really comment on whether protocol A
> > or B with current drbd svn (on the way to 0.7.1) will help.
> >
> > you need to benchmark it yourself.
> > please post your findings.
> 
> O.k., I will do that. Switching between the protocols won't do any harm 
> to the filesystem, will it?

well...
if it does, it's a serious bug.
it did not happen to me so far.

what you can do to try and tune drbd:
play with sndbuf-size
     increasing it may help throughput,
     decreasing it may help latency,
     ... or vice versa ... it depends ...
play with the mtu of the link (jumbo frames)
play with max-buffers
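
as a rough sketch, the drbd knobs above live in the net section of
drbd.conf. the resource name r0 and all values below are placeholders
made up for illustration, not recommendations; change one thing at a
time and benchmark after each change.

```
resource r0 {
  protocol C;             # A, B or C; keep the config identical on both nodes
  net {
    sndbuf-size  256k;    # assumed value: try larger and smaller
    max-buffers  2048;    # assumed value
  }
  # disk, syncer and "on <host>" sections omitted here
}
```

the mtu is set on the replication interface itself, outside drbd.conf,
and jumbo frames only help if both NICs and any switch in between
support them.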

	Lars Ellenberg

-- 
please use the "List-Reply" function of your email client.