[DRBD-user] slow to stall when sync kicks in

Christophe Bouder Christophe.Bouder at lip6.fr
Wed Sep 7 16:05:57 CEST 2011

Hi,
We have exactly the same problem when copying large files.
We're running 64-bit Debian with kernel 3.0.3, drbd 8.3.11 and the ocfs2
filesystem. We have dedicated two 10G ports to drbd.
Are you still encountering the same problem? Did you get any solutions
from the drbd team? Did you find any tricks of your own?

Thanks in advance,


On 16/08/2011 01:24, Dennis Su wrote:
> Hi,
> We have built two servers, with different hardware and capacity, on
> 64-bit CentOS 5.6 with drbd 8.3.8. The backing devices are gfs2
> devices layered on top of LVM because the two nodes have different
> capacities, and that limits our ability to play with the barriers
> effectively. Each server has two Gb ports: one connects to the
> network and the other uses a crossover cable dedicated to drbd sync.
> We are running a number of tests to see how well the servers perform
> before putting them into production. During the weeks-long testing,
> one thing has not seemed right. We tried tweaking the config, but
> gained nothing, so we went back to the defaults for most settings,
> as shown in the drbdsetup show output below.
> It is a dual-primary setup on a 10TB device. Sync/resync between
> the nodes, with no other activity on the drbd devices, runs at a
> sustained 63MB/s in both directions. However, writing to either node
> during the sync can cause both the write and the sync speed to drop
> to 2KB/s. The same can happen on any new write operation: for the
> first 10 seconds the write runs at the expected speed, then the drbd
> sync kicks in (we can see it in /proc/drbd) and the write slows
> almost to a stall. Once the sync stops, as observed from both ifstat
> and /proc/drbd, writing to the node resumes at the expected speed,
> but once the sync starts again the writing slows down. The pattern
> repeats until the entire write operation is complete. We have tested
> with small files and the effect is minimal, so no problem there, but
> we intend to use these servers to store large files, which can be a
> few GB each. Initially we thought that high IO might be the culprit,
> so we took drbd out of the picture and ran simultaneous read and
> write tests, which were fine with both large and small files. Now we
> think that drbd might need to lock the file to perform the sync:
> while it is locked, the continuous write stream to the file is not
> permitted, and once the sync is done drbd releases the file for
> writing again. We then tried tweaking the buffers and bio, with no
> improvement. Of course, this is just a guess, but if it is true, is
> there any way to tweak drbd to perform better with big files?
> We also tried swapping the crossover cable for a straight cable, with
> no improvement.
> Thanks in advance,
> Dennis
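
The pattern described above (writes running at full speed, then dropping
to almost nothing whenever a sync shows up in /proc/drbd) can be put into
numbers by sampling the counters in /proc/drbd at a fixed interval. The
following is only a minimal Python sketch. It assumes the 8.3-style
status line, where ns: (network send) and dw: (disk write) are cumulative
counters in KiB. The two snapshots are made-up values for illustration,
not output from these servers, and the helper names are hypothetical.

```python
import re

# Numeric "key:value" counters as they appear on a /proc/drbd status line.
COUNTER = re.compile(r"\b([a-z]{2,3}):(\d+)\b")

# Made-up snapshots taken 10 seconds apart (illustrative values only).
SNAPSHOT_T0 = """ 0: cs:SyncSource ro:Primary/Primary ds:UpToDate/Inconsistent C r----
    ns:102400 nr:0 dw:204800 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:9000000"""
SNAPSHOT_T10 = """ 0: cs:SyncSource ro:Primary/Primary ds:UpToDate/Inconsistent C r----
    ns:747520 nr:0 dw:204820 dr:0 al:0 bm:0 lo:1 pe:3 ua:0 ap:1 ep:1 wo:b oos:8354880"""


def parse_counters(status_text):
    """Return every numeric key:value counter found in the snapshot text."""
    return {key: int(value) for key, value in COUNTER.findall(status_text)}


def rate_kib_per_s(before, after, key, seconds):
    """Average rate of one cumulative counter between two snapshots."""
    return (after[key] - before[key]) / seconds


t0 = parse_counters(SNAPSHOT_T0)
t10 = parse_counters(SNAPSHOT_T10)
print("network send: %.0f KiB/s" % rate_kib_per_s(t0, t10, "ns", 10))
print("disk write:   %.0f KiB/s" % rate_kib_per_s(t0, t10, "dw", 10))
```

On a live node the two snapshots would come from reading /proc/drbd
twice with a sleep in between, which gives per-interval figures that can
be lined up against ifstat.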
> ###################drbdsetup /dev/drbd0 show ##############
> disk {
>         size                    0s _is_default; # bytes
>         on-io-error             detach;
>         fencing                 dont-care _is_default;
>         max-bio-bvecs           0 _is_default;
> }
> net {
>         timeout                 60 _is_default; # 1/10 seconds
>         max-epoch-size          2048 _is_default;
>         max-buffers             2048 _is_default;
>         unplug-watermark        256;
>         connect-int             10 _is_default; # seconds
>         ping-int                10 _is_default; # seconds
>         sndbuf-size             0 _is_default; # bytes
>         rcvbuf-size             0 _is_default; # bytes
>         ko-count                0 _is_default;
>         allow-two-primaries;
>         cram-hmac-alg           "sha1";
>         shared-secret           "U$eP@sswd";
>         after-sb-0pri           discard-zero-changes;
>         after-sb-1pri           discard-secondary;
>         after-sb-2pri           disconnect _is_default;
>         rr-conflict             disconnect _is_default;
>         ping-timeout            5 _is_default; # 1/10 seconds
> }
> syncer {
>         rate                    112640k; # bytes/second
>         after                   -1 _is_default;
>         al-extents              257;
>         delay-probe-volume      16384k _is_default; # bytes
>         delay-probe-interval    5 _is_default; # 1/10 seconds
>         throttle-threshold      20 _is_default; # 1/10 seconds
>         hold-off-threshold      100 _is_default; # 1/10 seconds
> }
> protocol C;
> _this_host {
>         device                  minor 0;
>         disk                    "/dev/mapper/vg0-r0";
>         meta-disk               internal;
>         address                 ipv4 10.1.1.35:7788;
> }
> _remote_host {
>         address                 ipv4 10.1.1.29:7788;
> }
> ##############################
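
The syncer section above sets rate 112640k, while the replication link
is a single Gb port. As a rough check of how much of that link the
configured rate would claim, here is a small Python calculation. It is a
sketch under stated assumptions: the k suffix is taken to mean 1024
bytes, the link is taken at a nominal 1 Gbit/s with no protocol
overhead, and the 30% figure is the sync-rate rule of thumb from the
DRBD User's Guide, quoted from memory. The function name is made up for
illustration.

```python
def size_to_bytes(value):
    """Convert a size such as '112640k' to bytes (assumes k = 1024)."""
    units = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}
    suffix = value[-1].lower()
    if suffix in units:
        return int(value[:-1]) * units[suffix]
    return int(value)


# Nominal capacity of the dedicated 1 Gb link, in bytes per second.
LINK_BYTES_PER_S = 1_000_000_000 // 8

configured = size_to_bytes("112640k")            # syncer rate from the dump
share_of_link = configured / LINK_BYTES_PER_S    # fraction of the raw link
rule_of_thumb = LINK_BYTES_PER_S * 30 // 100     # assumed 30% guideline

print("configured rate: %d bytes/s" % configured)
print("share of link:   %.0f%%" % (share_of_link * 100))
print("30%% of link:     %dk" % (rule_of_thumb // 1024))
```

Under these assumptions the configured rate sits close to the raw link
capacity, which would leave little headroom for application writes that
must cross the same link under protocol C. Whether that explains the
stalls is not established here; it is one setting that may be worth
testing at a lower value.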
>
> Hemlock Printers, Ltd.
> (604) 439-5075
>
> _______________________________________________
> drbd-user mailing list
> drbd-user at lists.linbit.com
> http://lists.linbit.com/mailman/listinfo/drbd-user


-- 
Christophe Bouder,

