Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.
Hi,

Again I had the chance to set up a DRBD cluster for a LINBIT customer.
It was the first time I had one of these new SATA devices really under
my fingers [ without any of these "enterprise ready" RAID controllers,
which are in reality rather slow... ].

The machines are some DELLs with a single P4 @ 2.8 GHz, an "Intel Corp.
6300ESB SATA Storage Controller" IDE controller, two "Intel Corp. 82547GI
Gigabit Ethernet Controller" NICs and 512 MB RAM. The disk calls itself
"ST380011AS", a Seagate Barracuda.

At first the performance of the disk was miserable, in the area of
~5 MB/sec. As it turned out, the reason for this was that we used Linux's
common IDE (PATA) driver. Then we tried the libata/ata_piix driver
combination, and suddenly we got a write performance in the area of
40 MB/sec. BTW, with libata the disk suddenly appears as a SCSI disk!
[ -> changing all config files from "hdc" to "sda" ]

Network setup: e1000 driver, the machines connected with a straight
Cat5e cable, the cards forced into "speed 1000" with ethtool, and the
MTU set to 9000, aka Jumbo Frames.

I am interested in raw data throughput, so I did sequential writes on an
ext3 filesystem.

Test1
I wrote a 1 GB file (with sync) to the root partition [Cyl: 250 to 1466]
3 times:

  43.35 MB/sec (1073741824 B / 00:23.621594)
  40.43 MB/sec (1073741824 B / 00:25.328009)
  40.78 MB/sec (1073741824 B / 00:25.112768)
  avg: 41.52

Test2
Then I did the same on a connected DRBD device (protocol C), also ext3
[Cyl: 2747 to 6151]:

  39.05 MB/sec (1073741824 B / 00:26.226047)
  35.95 MB/sec (1073741824 B / 00:28.483070)
  36.48 MB/sec (1073741824 B / 00:28.068153)
  avg: 37.16

At first I was satisfied with the outcome that DRBD [with protocol C]
costs you about 10% of your throughput with sequential writes.

Test3
But then I did the same test with DRBD disconnected and got these numbers
[Cyl: 2747 to 6151]:

  39.63 MB/sec (1073741824 B / 00:25.840004)
  40.30 MB/sec (1073741824 B / 00:25.406312)
  39.82 MB/sec (1073741824 B / 00:25.713998)
  avg: 39.91

I asked myself: why is it 4% below the first test?
Assumption: maybe because the mirrored partition lies behind the root
partition, and hard disks are slower on the inner (higher-numbered)
cylinders than on the outer ones.

Test4
So I unloaded the DRBD module and mounted the backing storage devices on
the mountpoints directly [Cyl: 2747 to 6151]:

  39.65 MB/sec (1073741824 B / 00:25.823633)
  38.54 MB/sec (1073741824 B / 00:26.570280)
  37.26 MB/sec (1073741824 B / 00:27.479914)
  avg: 38.48

Test3 was 3.5% faster than Test4. This could be explained by the fact
that DRBD sometimes triggers the immediate write of buffers to disk.

The DRBD mirroring overhead, thus Test4 to Test2, is 3.4%, which is
smaller than the performance difference within the disk device:
Test1 to Test4 is 7.3%.

CPU usage: I monitored CPU usage on the secondary system using the "top"
utility, and the highest value I saw for the drbd_receiver was 7.7%.

Resync performance: For the customer I configured the syncer to run with
10 MB/sec; this makes sure that the customer's application will continue
to run during a resync operation. For testing purposes I set the resync
rate to 40M and got a resync rate in the area of 33 MByte/sec.
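For reference, the relevant bits of such a setup look roughly like this.
The interface name (eth1), the resource name (r0) and the surrounding
drbd.conf sections are illustrative only, not the customer's actual
configuration:

  # replication link: force gigabit and enable Jumbo Frames
  # (eth1 is an assumed interface name)
  ethtool -s eth1 speed 1000 duplex full autoneg off
  ifconfig eth1 mtu 9000

  # drbd.conf (DRBD 0.7 syntax): cap background resync at 10 MB/sec
  resource r0 {
    protocol C;
    syncer {
      rate 10M;    # raised to 40M for the resync test above
    }
    # disk, net and "on <host>" sections omitted
  }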
Effect of Jumbo Frames / 9000 MTU
I repeated Test2 with an MTU of 1500 bytes and got these numbers:

  36.27 MB/sec (1073741824 B / 00:28.234703)
  36.22 MB/sec (1073741824 B / 00:28.270442)
  36.41 MB/sec (1073741824 B / 00:28.121841)

On the secondary system the peak CPU system time was 7%, and the maximum
I spotted on the drbd_receiver thread was 9.7%. So it seems that the
Jumbo Frames only ease the task of the secondary node, but do not
improve performance.

Test setup: Linux 2.6.9, DRBD 0.7.5. Writes were done with this command:

  ./dm -a 0x00 -o /mnt/ha0/1Gfile -b1M -m -p -y -s1G

dm is from drbd-0.7.5.tar.gz, in the benchmark directory.

Conclusion:
===========
The performance inhomogeneity within a single disk drive can be bigger
(measured 7.3%) than the loss of performance caused by DRBD mirroring
(measured 3.4%).

This only holds true if the limiting factor is the performance of the
disk. In other words: your network link and your CPU need to be strong
enough.

-phil
--
: Dipl-Ing Philipp Reisner                   Tel +43-1-8178292-50 :
: LINBIT Information Technologies GmbH       Fax +43-1-8178292-82 :
: Schönbrunnerstr 244, 1120 Vienna, Austria  http://www.linbit.com :