Hi all,

I've recently upgraded to DRBD 8.4.3 (protocol C) on CentOS 6.4 (kernel 3.10.10) with Xen 4.3.0 on hardware RAID10 with an Infiniband 20Gbit/sec replication link.

For a few days now, we've been experiencing a very strange issue whereby (seemingly randomly) the system will become almost unresponsive, with iowait going to 100% on some (but not all) domUs and dom0, but even the domUs whose load remains stable will still be incredibly sluggish. The problem occurs even when the resources are in standalone mode. Sometimes it self-corrects, but it's becoming more severe and is now less likely to go away without a reboot.

Earlier today, the system running as primary was at 0.02 load, and the slave (which was doing nothing other than receiving updates from the master, no domUs running) went to 13 load and was pretty much dead.

I've tried a variety of tuning options, including enabling disable_sendpage, but nothing is making it any better. Nothing is printed to the logs.

My next thought is to try downgrading to DRBD 8.3, but considering support ends in December, I'd much prefer to continue using 8.4.

I'm very much hoping that someone more experienced than myself will be able to offer some words of wisdom. :)

Thanks

Regards,
Stephen Marsh