On Mon, Mar 5, 2012 at 11:45 PM, Andreas Bauer <ab at voltage.de> wrote:
> I can share an observation:
>
> (Disclaimer: my knowledge of the Linux I/O stack is very limited)
>
> Kernel 3.1.0, DRBD 8.3.11, DRBD->LVM->MD-RAID1->SATA DISKS
> (Disks use CFQ scheduler)
>
> issue command: drbdadm verify all
> (with combined sync rate set to exceed disk performance)
>
> The system will become totally unresponsive, up to the point that all
> processes will wait longer than 120s to complete any I/O. In fact, their
> I/O does not get through until I/O load from the DRBD verify drops
> because some volumes have completed their run.

Sorry, but this is a bit like:

"Doctor, I poked a rusty knife into my eye..."
"Yes?"
"... and now I have a problem."
"Well, you already said that."

If you're telling your system to use a sync/verify rate that you _know_
to be higher than what the disk can handle, then kicking off a verify
(drbdadm verify) or full sync (drbdadm invalidate-remote) will badly
beat up your I/O stack.

The documentation tells you to use a sync rate that doesn't exceed about
one third of your available bandwidth. You can also use variable-rate
synchronization, which should take care of properly throttling the
syncer rate for you. But by deliberately setting a sync rate that
exceeds disk bandwidth, you're begging for trouble. Why would you want
to do this?

The CFQ I/O scheduler is a bad choice for servers too, but that's
probably the lesser of your concerns right now.

Cheers,
Florian

--
Need help with High Availability?
http://www.hastexo.com/now
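[Editor's note: the advice above — cap the fixed sync rate at roughly one
third of available bandwidth, or use variable-rate synchronization — can
be sketched as a drbd.conf fragment. This is illustrative only: the
resource name "r0" and all numbers are assumptions, and the variable-rate
controller options require DRBD 8.3.9 or later (the poster runs 8.3.11).]

```
resource r0 {
  syncer {
    # Fixed-rate option: cap resync at roughly 1/3 of the slowest
    # link/disk bandwidth, e.g. ~33 MB/s on a ~100 MB/s path.
    # rate 33M;

    # Variable-rate option (DRBD 8.3.9+): let DRBD throttle the
    # resync rate dynamically instead of using a fixed rate.
    c-plan-ahead 20;     # enables the dynamic controller (tenths of a second)
    c-fill-target 50k;   # target amount of in-flight resync data
    c-min-rate 10M;      # floor so resync still makes progress under load
    c-max-rate 100M;     # hard ceiling on the resync rate
  }
}
```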
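[Editor's note: on the CFQ remark, the scheduler can be inspected and
switched per block device via sysfs at runtime. A minimal sketch, assuming
a device named sda and that the deadline scheduler is compiled in; both
are assumptions, adjust to your system.]

```
# Show available schedulers; the active one is in [brackets]:
cat /sys/block/sda/queue/scheduler

# Switch to deadline, often a better fit for server workloads than cfq
# (requires root; not persistent across reboots):
echo deadline > /sys/block/sda/queue/scheduler
```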