Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.
Hello,

I have a mailstore that is about 1.2TB in size that I need to migrate from one set of storage to another. To avoid a long stretch of downtime while the data copies, I am planning to use DRBD to replicate the bulk of the data while the mail system is online. I believe I have the process for this down, but I am concerned about the loss in performance I am seeing when DRBD is enabled.

My test setup is two Dell PE2950 servers, each with a fully populated MD1000 array attached: fifteen 300GB 10K drives in a RAID10 configuration. The servers are on a 1Gbit network and connected to the same HP switch, which is not being used for anything else at the moment.

I am using bonnie++ to test performance, with the following command:

$ bonnie++ -u root -d /repl/ -s0 -n 256:10k:1k:128 -f

When run on each of the servers individually, here is what I get:

===== SERVER A =====
Version  1.03       ------Sequential Create------ --------Random Create--------
serverA             -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
          files:max  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
           256:10:0 22818  67 35094  39 22575  55 19946  57 36000  38 15274  42
====================

===== SERVER B =====
Version  1.03       ------Sequential Create------ --------Random Create--------
serverB             -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
          files:max  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
           256:10:0 22733  65 35465  40 22735  54 20955  60 36114  39 15237  42
====================

When set up for replication from server A to server B, here is what I get on server A:

====================
Version  1.03       ------Sequential Create------ --------Random Create--------
serverA             -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
          files:max  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
           256:10:0 19270  57 18322  34 13527  43 18790  56 18523  31  8223  34
====================

My drbd.conf file looks like this:

====================
resource r0 {
  protocol C;
  incon-degr-cmd "echo 'DRBD: pri on incon-degr' | wall ; sleep 10";

  startup {
    wfc-timeout 0;
    degr-wfc-timeout 120;
  }

  disk {
    on-io-error detach;
  }

  net {
    # max-buffers 20480;
    # max-epoch-size 16384;
    sndbuf-size 512K;
  }

  syncer {
    rate 512M;
    group 1;
    al-extents 1024;
  }

  on serverA {
    device    /dev/drbd0;
    disk      /dev/vg1/repl;
    address   10.103.5.150:7788;
    meta-disk /dev/vg0/drbd-meta[0];
  }

  on serverB {
    device    /dev/drbd0;
    disk      /dev/vg1/repl;
    address   10.103.5.151:7788;
    meta-disk /dev/vg0/drbd-meta[0];
  }
}
====================

I have tried the different protocols as well as various other settings, but nothing seems to have much impact. So I am curious: what are the best next steps for finding the bottleneck?

Thanks,
robert
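
P.S. To frame the question a bit, here is a rough sketch of how I was thinking of separating the network path from the disk path. It assumes iperf is installed on both boxes and reuses the IPs and resource name from the config above; these are just the commands I intend to try, not results.

====================
# On serverB: start an iperf listener
serverB$ iperf -s

# On serverA: measure raw TCP throughput over the replication link (30s run)
serverA$ iperf -c 10.103.5.151 -t 30

# Re-run the disk benchmark with replication temporarily disconnected,
# so the local RAID10 path can be measured in isolation
serverA$ drbdadm disconnect r0
serverA$ bonnie++ -u root -d /repl/ -s0 -n 256:10k:1k:128 -f
serverA$ drbdadm connect r0
====================

If the raw TCP number comes back well under wire speed I would suspect the NICs or the switch; if it is close to 1Gbit I would look harder at DRBD's own buffering (the commented-out max-buffers / max-epoch-size above) and the meta-disk placement.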