Felix, thanks for your reply. I ran an experiment yesterday; here is what I've got.

I created ten DRBD devices using the default syncer rate (primary/primary configuration) and got an initial sync rate of about 1 MB/s per device. My client is a Windows 2008 server running IOMeter, with the default I/O timeout of 60 s. I rebooted one node and the other took over. I/O ran smoothly until resynchronization of the rebooted node started; application I/O then dropped to nearly zero, with a resync rate of about 500 KB/s per device.

What confuses me is that 500 KB/s * 10 devices = 5 MB/s, which is only a few percent of the overall 1 Gbit/s bandwidth. The application I/Os were eventually abandoned by the client due to timeouts, which resulted in a LUN reset (all assigned devices went offline). As mentioned in your previous email, I did not promote a device to Primary until the sync finished.

Again, here is my DRBD configuration:

    resource drbd10 {
        on FA33 {
            device    /dev/drbd10;
            disk      /dev/disk/by-id/scsi-360030480003ae2e0159207cc2a2ac9d4;
            address   192.168.251.1:7799;
            meta-disk internal;
        }
        on FA34 {
            device    /dev/drbd10;
            disk      /dev/disk/by-id/scsi-360030480003ae32015920a821ca7f075;
            address   192.168.251.2:7799;
            meta-disk internal;
        }
        net {
            allow-two-primaries;
            after-sb-0pri    discard-younger-primary;
            after-sb-1pri    discard-secondary;
            after-sb-2pri    violently-as0p;
            rr-conflict      violently;
            max-buffers      8000;
            max-epoch-size   8000;
            unplug-watermark 16;
            sndbuf-size      0;
        }
        syncer {
            verify-alg crc32c;
            al-extents 3800;
        }
        handlers {
            before-resync-target "/sbin/before_resync_target.sh";
            after-resync-target  "/sbin/after_resync_target.sh";
        }
    }

Has anyone encountered a similar problem before?

On Wed, Jun 22, 2011 at 3:34 AM, Felix Frank <ff at mpexnet.de> wrote:
> On 06/22/2011 04:28 AM, Digimer wrote:
> > Are all ten DRBD resources on the same set of drives?
>
> Good hint: if there *are* indeed 10 DRBDs, the syncer rate should of
> course be 30% * THROUGHPUT / NUM_DRBDs, because each resource will use
> the defined rate. I.e. in your case, some 30M.
>
> To the OP: Does the rebooted node become Primary before the sync is
> complete? If so, you may want to try leaving it Secondary until
> everything is back up in sync.
> Requests to an Inconsistent node can cause network overhead.
>
> Cheers,
> Felix
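Felix's rule of thumb above can be worked through for this setup. A minimal sketch of the arithmetic, assuming the 1 Gbit/s replication link mentioned in the thread and treating the 30% figure as the resync bandwidth budget (the exact fraction is a tuning choice, not a DRBD requirement):

```python
# Hypothetical numbers matching the setup described in the thread:
# a 1 Gbit/s link shared by 10 DRBD resources.
LINK_MBIT_PER_S = 1000     # replication link speed, megabits per second
SYNC_FRACTION = 0.30       # rule of thumb: cap resync at ~30% of bandwidth
NUM_RESOURCES = 10

link_mb_per_s = LINK_MBIT_PER_S / 8                # raw link: 125 MB/s
total_sync_mb = link_mb_per_s * SYNC_FRACTION      # resync budget for ALL resources
per_resource_mb = total_sync_mb / NUM_RESOURCES    # budget per resource

print(f"total resync budget: {total_sync_mb} MB/s")
print(f"per-resource syncer rate: {per_resource_mb} MB/s")
```

Since each resource applies its configured rate independently, the per-resource figure is what would go into each resource's syncer section, e.g. `syncer { rate 3M; }`, so that ten concurrent resyncs together stay near the 30% budget rather than each claiming it.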