Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.
I have an application that requires O_SYNC. Writing 4K blocks at a time, I see 70us of overhead, and I'm hoping someone on this mailing list can help me explain it. Another issue is the optimal block size: I chose 4K for performance reasons, but I really wanted to use 1K.

My tests write either to an ext2 file system (4K block size) with no ext2 extents, or directly to the DRBD block device with no filesystem (/dev/drbd). With ext2, I ensure no ext2 metadata (inode/extents) is written during the tests. Each test below was run for 10 seconds on a 10 MB file, writing in 4K increments with either O_SYNC or O_DIRECT. All tests below give the same results for O_SYNC or O_DIRECT, whether through the ext2 filesystem or DRBD's raw block device. (A sketch of the write loop is at the end of this mail.)

- With a RAM disk on both Primary and Secondary, I see 130us to write one 4K block.
  Note: I wanted to benchmark the network performance of DRBD here.

    on PRIMARY {               | on SECONDARY {
      disk      /dev/ram0;     |   disk      /dev/ram0;
      meta-disk internal       |   meta-disk internal

- With the Primary on a hard disk and the Secondary down, I see 250us to write one 4K block.
  Note: I wanted to benchmark local performance here.

    on PRIMARY {
      disk      /dev/sda5;
      meta-disk internal

- With DRBD using hard disks on both Primary and Secondary, I see 450us to write one 4K block.
  Note: I would expect 380us, i.e. disk performance (250us) + network (130us).

    on PRIMARY {               | on SECONDARY {
      disk      /dev/sda5;     |   disk      /dev/sda5;
      meta-disk internal       |   meta-disk internal

- With a hard disk on the Primary and a RAM disk on the Secondary, I see 260us to write one 4K block.
  Note: from the tests above I conclude there is an extra 70us of overhead in the Secondary's drbd/disk block I/O subsystem.

    on PRIMARY {               | on SECONDARY {
      disk      /dev/sda5;     |   disk      /dev/ram0;
      meta-disk internal       |   meta-disk internal

- With DRBD's metadata on a RAM disk on both Primary and Secondary, and local hard drives on both, I see 450us to write one 4K block.
  Note: I conclude from this that DRBD's metadata is not the performance issue.

    on PRIMARY {                 | on SECONDARY {
      disk      /dev/sda5;       |   disk      /dev/sda5;
      meta-disk /dev/ram1[0]     |   meta-disk /dev/ram1[0]

I turned on DRBD's tracing on the Secondary for one 4K write, and here is what I see:

    drbd2_receiver [6671] data <<< Data (sector 196608s, id ffff810354d437b8, seq 187921, f 0)
    drbd2_receiver [6671] data <<< UnplugRemote (7)
    drbd2_asender  [6764] meta >>> WriteAck (sector 196608s, size 4096, id ffff810354d437b8, seq 188238)
    drbd2_receiver [6671] meta >>> BarrierAck (barrier 2539266504)
    drbd2_receiver [6671] data <<< Barrier (barrier 2539266504)

The questions I have are:

- In the DRBD trace above, at what point does an application on the Primary return from write(2)? Is it after WriteAck, BarrierAck, or Barrier?

- Has anyone noticed a drop in performance/latency when using O_SYNC or O_DIRECT? Is the BarrierAck/Barrier adding to the 70us performance drop? Should I expect disk performance + network performance as the total latency, or is there more overhead I should account for? In my test case above, disk performance is 260us and drbd/network alone is 130us; I see 450us but expect 380us.

- What is the optimal block size for DRBD via mkfs -t ext2 -b <blocksize>? I would like to use 1024, but the Secondary performs a disk Read/Modify/Write when using 1K. Is the Secondary managing blocks in 4K units?
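For reference, below is a minimal sketch of the kind of 4K O_DIRECT/O_SYNC write loop described above. It is a hypothetical illustration, not the exact test harness; the file path, block size, file size and run time are placeholders matching the numbers in this mail.

/* Minimal sketch of an O_DIRECT (or O_SYNC) 4K-write latency test.
 * Hypothetical illustration only -- not the exact harness used above.
 * Build: gcc -O2 -o writetest writetest.c
 */
#define _GNU_SOURCE              /* for O_DIRECT on Linux */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/time.h>
#include <unistd.h>

#define BLOCK_SIZE 4096                 /* 4K writes, as in the tests      */
#define FILE_SIZE  (10 * 1024 * 1024)   /* 10 MB test file                 */
#define RUN_SECS   10                   /* each test ran for 10 seconds    */

int main(int argc, char **argv)
{
    const char *path = (argc > 1) ? argv[1] : "/mnt/ext2/testfile";
    void *buf;

    /* O_DIRECT requires an aligned buffer; it is harmless for O_SYNC too. */
    if (posix_memalign(&buf, BLOCK_SIZE, BLOCK_SIZE))
        return 1;
    memset(buf, 0xAB, BLOCK_SIZE);

    /* Swap O_DIRECT for O_SYNC to compare the two paths. */
    int fd = open(path, O_WRONLY | O_CREAT | O_DIRECT, 0644);
    if (fd < 0) { perror("open"); return 1; }

    struct timeval start, now;
    gettimeofday(&start, NULL);
    long  writes = 0;
    off_t off    = 0;

    for (;;) {
        if (pwrite(fd, buf, BLOCK_SIZE, off) != BLOCK_SIZE) {
            perror("pwrite");
            break;
        }
        writes++;
        off += BLOCK_SIZE;
        if (off >= FILE_SIZE)
            off = 0;                    /* wrap within the 10 MB file */

        gettimeofday(&now, NULL);
        double elapsed = (now.tv_sec - start.tv_sec)
                       + (now.tv_usec - start.tv_usec) / 1e6;
        if (elapsed >= RUN_SECS) {
            printf("%ld writes in %.2fs -> %.1f us per 4K write\n",
                   writes, elapsed, elapsed * 1e6 / writes);
            break;
        }
    }

    close(fd);
    free(buf);
    return 0;
}

Pointing the same loop at the ext2 file or at /dev/drbd2 directly, and dividing elapsed time by the write count, gives the per-write latencies quoted above.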